When you’re working with big data in a distributed, parallel-processing environment like Hadoop, job scheduling and workflow management are vital for efficient operation. Schedulers let you share cluster resources at the job level; in the first half of this chapter, I use practical examples to guide you through installing, configuring, and using the Fair and Capacity schedulers on Hadoop V1 and V2. At a higher level, workflow tools let you manage the relationships between jobs. For instance, a workflow might include jobs that source, clean, process, and output a dataset, with each job running in sequence and the output of one forming the input of the next. So, in the second half of this chapter, I demonstrate how workflow tools such as Oozie manage these relationships.
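To make that sequential pipeline concrete, the listing below sketches a minimal Oozie workflow definition that chains two MapReduce jobs, where the output directory of the first job ("clean") becomes the input directory of the second ("process"). This is an illustrative sketch rather than one of this chapter's worked examples: the action names and the ${jobTracker}, ${nameNode}, ${rawInput}, ${cleanDir}, and ${processedDir} parameters are hypothetical placeholders, and each job's remaining configuration (mapper and reducer classes, and so on) is omitted for brevity.

<workflow-app name="etl-pipeline" xmlns="uri:oozie:workflow:0.2">
  <start to="clean"/>

  <!-- Job 1: clean the raw input data -->
  <action name="clean">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>mapred.input.dir</name>
          <value>${rawInput}</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>${cleanDir}</value>
        </property>
        <!-- mapper/reducer classes and other job settings omitted -->
      </configuration>
    </map-reduce>
    <ok to="process"/>   <!-- on success, transition to the next job -->
    <error to="fail"/>
  </action>

  <!-- Job 2: process the cleaned data; its input is job 1's output -->
  <action name="process">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>mapred.input.dir</name>
          <value>${cleanDir}</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>${processedDir}</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

The key design point is that the job-to-job relationships live in the workflow definition, not in the jobs themselves: each <ok> transition names the next node, so Oozie, rather than hand-rolled driver scripts, decides what runs when and what happens on failure.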