Andrew Shrout


I am a Data Analyst, passionate about working with organisations that want to make effective use of their data: from the first ideas around what you want to measure and how to capture data accurately, through data preparation, analysis and management, all the way to impactful presentation of insights.
View My LinkedIn Profile

View My GitHub Profile

Real Estate ETL Pipeline

Project description: For a proptech startup, I acquired and cleaned more than 4M data points a month.

I set up multiple scrapers on DigitalOcean droplets and automated them with cron jobs to scrape bimonthly. They reported to Telegram with telemon and deposited the data into an S3 bucket. Each upload triggered CloudWatch (for reporting) and Lambda functions that cleaned the data and loaded it into an AWS RDS database. The data was then fetched client-side via CQL through a GeoServer instance set up on EC2.
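As a rough illustration of the S3-to-Lambda cleaning step described above, here is a minimal sketch in Python. It is not the project's actual code: the column names `id` and `price`, the function names, and the injectable `fetch` helper are all assumptions made for the example, and the write to RDS is left out.

```python
import csv
import io
from urllib.parse import unquote_plus


def s3_objects(event):
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def clean_listings(csv_text):
    """Drop rows without a usable price and skip repeated listing ids.

    The `id` and `price` columns are hypothetical; real scraper output
    will have its own schema.
    """
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        listing_id = (row.get("id") or "").strip()
        price = (row.get("price") or "").replace("$", "").replace(",", "").strip()
        if not listing_id or listing_id in seen:
            continue
        try:
            row_price = float(price)
        except ValueError:
            continue  # price missing or not numeric
        seen.add(listing_id)
        cleaned.append({"id": listing_id, "price": row_price})
    return cleaned


def handler(event, context, fetch=None):
    """Lambda entry point. `fetch(bucket, key)` returns the object body as text."""
    if fetch is None:
        import boto3  # provided by the AWS Lambda Python runtime

        client = boto3.client("s3")

        def fetch(bucket, key):
            body = client.get_object(Bucket=bucket, Key=key)["Body"]
            return body.read().decode("utf-8")

    results = {}
    for bucket, key in s3_objects(event):
        results[key] = clean_listings(fetch(bucket, key))
        # Loading the cleaned rows into RDS is omitted from this sketch.
    return results
```

Passing `fetch` as a parameter keeps the cleaning logic testable locally without AWS credentials; in Lambda the default path reads the uploaded object with boto3.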

1. Scrapers

Apartments.com Scraper
Craigslist Scraper
Airbnb Scraper
Redfin Scraper

2. Cleaning Scripts

Readme
Craigslist Cleaner
Apartments Cleaner
Airbnb Cleaner
Cron Script

3. Jupyter Notebook Data Exploration / Modeling

Analysis + Readme
Craigslist LA Data Analysis
Crime Data Exploration + Modeling

4. SQL

SQL Examples

5. Map App / Visualizations

See Mapstack page.