2015-06-19 - First draft
In data pipeline and configuration management systems, it is very common to need to execute a set of jobs that depend on one another.
Write a program, pipeline_runner, to execute a list of shell scripts. The scripts and their dependencies are defined in a JSON file. The program takes a single argument: the path of the JSON file that defines the jobs.
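As a rough illustration of what reading such a job file might look like, here is a minimal Python sketch. The JSON layout and the field names (name, script, inputs, output) are assumptions made up for this example, not the format the exercise prescribes, and parse_jobs is a hypothetical helper.

```python
import json
from dataclasses import dataclass, field

# Hypothetical job file content: the field names below are assumptions
# for illustration only, not the exercise's actual schema.
EXAMPLE_JOBS_JSON = """
[
  {"name": "extract", "script": "extract.sh", "inputs": [], "output": "raw.csv"},
  {"name": "clean", "script": "clean.sh", "inputs": ["raw.csv"], "output": "clean.csv"}
]
"""


@dataclass
class Job:
    name: str
    script: str
    output: str                                  # exactly one output file per job
    inputs: list = field(default_factory=list)   # zero or more input files


def parse_jobs(text):
    """Parse a JSON array of job objects into Job records."""
    return [
        Job(
            name=item["name"],
            script=item["script"],
            output=item["output"],
            inputs=list(item.get("inputs", [])),
        )
        for item in json.loads(text)
    ]


jobs = parse_jobs(EXAMPLE_JOBS_JSON)
```

In a real runner, the text would come from the file path passed as the program's single argument rather than from an embedded string.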
To run the program
As you can see, each job has its input files and output files.
- A job will only be executed if all its input files exist.
- A job can have multiple input files (or none) but produces exactly one output file.
- Users can run the program multiple times; if a job’s output file already exists, the program skips that job.
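The three rules above can be sketched in Python as follows. This is one possible reading, not a reference solution: jobs are assumed to be dictionaries with hypothetical keys name, script, inputs, and output, and scripts are assumed to be runnable with sh. The names job_status and run_pipeline are invented for this example.

```python
import os
import subprocess


def job_status(job):
    """Classify a job as 'skip', 'ready', or 'blocked'."""
    if os.path.exists(job["output"]):
        return "skip"       # output already exists, so the job is skipped
    if all(os.path.exists(path) for path in job["inputs"]):
        return "ready"      # every input exists (trivially true with no inputs)
    return "blocked"        # at least one input file is missing


def run_pipeline(jobs, run=None):
    """Keep running ready jobs until a full pass starts nothing new."""
    if run is None:
        # Assumption: each job's script is a shell script executed via sh.
        run = lambda job: subprocess.run(["sh", job["script"]], check=True)
    executed = []
    progressed = True
    while progressed:
        progressed = False
        for job in jobs:
            # Never run a job twice, even if its script produced no output.
            if job["name"] in executed or job_status(job) != "ready":
                continue
            run(job)
            executed.append(job["name"])
            progressed = True
    return executed
```

Repeating the pass lets a job whose inputs are produced by a later entry in the list still run, so the jobs do not have to be listed in dependency order. A second invocation after a successful run executes nothing, because every output file already exists.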
If this is still unclear, think of a Makefile on Linux systems. The logic is quite similar.
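Following the Makefile analogy, one possible approach is to derive an execution order from the files themselves: a job depends on whichever job produces one of its inputs. The sketch below uses graphlib.TopologicalSorter from the Python standard library (Python 3.9+). The dictionary keys name, inputs, and output are assumptions for illustration, and execution_order is a hypothetical helper. Note one difference from Make: Make also compares timestamps, whereas the rules here only check whether files exist.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(jobs):
    """Return job names ordered so that producers come before consumers."""
    # Map each output file to the name of the job that produces it.
    producer = {job["output"]: job["name"] for job in jobs}
    # For each job, collect the jobs that produce its inputs. Inputs that
    # no job produces are assumed to be pre-existing files and are ignored.
    graph = {
        job["name"]: {producer[path] for path in job["inputs"] if path in producer}
        for job in jobs
    }
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


example = [
    {"name": "report", "inputs": ["clean.csv", "config.yml"], "output": "report.txt"},
    {"name": "clean", "inputs": ["raw.csv"], "output": "clean.csv"},
    {"name": "extract", "inputs": [], "output": "raw.csv"},
]
order = execution_order(example)
```

Ordering the jobs up front also surfaces circular dependencies before any script is run.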
You may complete the test in the programming language you prefer.