Data Collection and Data Quality Engineering
- Students built an ETL pipeline from scratch using public APIs. Using Python and Amazon Web Services (AWS) EC2 and S3, they extracted Twitter data about music artists, then built a pipeline to load it into a database of tweets and their relevant attributes. They used EC2 and S3 again to build a second, independent workflow to confirm the accuracy of the results.
- Students analyzed video virality by exploring the creation of additional UGC (user-generated content) videos, that is, third-party content on YouTube’s network. This work also ran on Amazon Web Services EC2.
- Mentor: Rick Saporta
- Students: Zach Levine, Ziad Amir
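
The transform, load, and cross-check steps of a pipeline like the one described above can be sketched roughly as follows. This is a minimal illustration, not the students' actual code: the record fields, the function names (`transform`, `load`, `cross_check`), and the use of an in-memory SQLite database are all assumptions made for the example, and the extraction step (the Twitter API calls and S3 storage) is left out entirely.

```python
import sqlite3

# Hypothetical raw records, shaped loosely like tweet payloads (illustrative, not real data).
RAW_TWEETS = [
    {"id": "1", "text": "New album out now", "user": {"screen_name": "fan_a"}, "artist": "Artist X"},
    {"id": "2", "text": "Tour dates announced", "user": {"screen_name": "fan_b"}, "artist": "Artist Y"},
    {"id": "2", "text": "Tour dates announced", "user": {"screen_name": "fan_b"}, "artist": "Artist Y"},
]


def transform(raw_tweets):
    """Flatten raw records into (id, artist, author, text) rows, skipping repeated ids."""
    seen, rows = set(), []
    for tweet in raw_tweets:
        if tweet["id"] in seen:
            continue
        seen.add(tweet["id"])
        rows.append((tweet["id"], tweet["artist"], tweet["user"]["screen_name"], tweet["text"]))
    return rows


def load(rows, conn):
    """Write rows into a tweets table (SQLite stands in for the unspecified database)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id TEXT PRIMARY KEY, artist TEXT, author TEXT, text TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO tweets VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def cross_check(primary_ids, secondary_ids):
    """Report ids that appear in one workflow's output but not in the other's."""
    primary, secondary = set(primary_ids), set(secondary_ids)
    return {
        "missing_in_primary": sorted(secondary - primary),
        "missing_in_secondary": sorted(primary - secondary),
    }


conn = sqlite3.connect(":memory:")
load(transform(RAW_TWEETS), conn)
stored_ids = [row[0] for row in conn.execute("SELECT id FROM tweets ORDER BY id")]

# Pretend a second, independent workflow collected ids "1", "2" and "3".
report = cross_check(stored_ids, ["1", "2", "3"])
print(report)
```

Comparing id sets is only one possible way to check one workflow against another; a fuller check might also compare attribute values or row counts per artist.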