TechBiz Global is hiring a Data Engineer – Web Scraping & ETL

About the Role

The role involves building and optimizing data extraction workflows, ensuring data accuracy and reliability, and supporting analytics initiatives through robust pipeline architecture.

Responsibilities

  • Develop and manage automated web scraping frameworks for diverse online sources
  • Design scalable ETL pipelines to process unstructured and semi-structured data
  • Ensure data integrity and consistency across ingestion and transformation stages
  • Monitor and troubleshoot data workflows for performance and reliability
  • Collaborate with data analysts and scientists to understand data requirements
  • Optimize data storage solutions for efficient querying and access
  • Implement error handling and retry mechanisms in data collection systems
  • Maintain documentation for data pipelines and source configurations
  • Evaluate new data sources for integration potential
  • Apply data validation techniques to ensure quality standards
  • Support compliance with website terms of service and data usage policies
  • Work with security teams to ensure ethical data collection practices
  • Improve data processing efficiency through automation and tooling
  • Respond to data quality incidents with root cause analysis
  • Participate in code reviews and system design discussions
  • Integrate third-party APIs into existing data workflows
  • Scale infrastructure to handle increasing data volume and velocity
  • Use version control for pipeline development and deployment
  • Stay current with changes in website structures and anti-bot measures
  • Contribute to data governance and metadata management practices
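
To give candidates a feel for the error handling and retry work mentioned above, here is a minimal sketch of retrying a fetch with exponential backoff. It is an illustration only, not the team's actual code: the function name `fetch_with_retry`, its parameters, and the injected `fetch` and `sleep` callables are all hypothetical.

```python
import random
import time


def fetch_with_retry(fetch, url, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying failed attempts with exponential backoff.

    `fetch` is any callable that returns a response or raises on failure
    (hypothetical; in practice it might wrap an HTTP client).
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except Exception as exc:  # a real pipeline would catch narrower error types
            last_error = exc
            if attempt == max_attempts - 1:
                break  # no point sleeping after the final attempt
            # The delay doubles each attempt (base, 2*base, 4*base, ...),
            # with up to 10% random jitter so workers do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError(f"giving up on {url} after {max_attempts} attempts") from last_error
```

Passing `sleep` as a parameter is a design choice for testability: a test can record the delays instead of actually waiting.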

Nice to Have

  • Master’s degree in a technical discipline
  • Experience with large-scale distributed data processing tools like Spark
  • Background in natural language processing or text extraction
  • Knowledge of browser automation tools such as Puppeteer or Selenium
  • Experience with proxy rotation and IP management for scraping
  • Familiarity with CAPTCHA-solving techniques and tools
  • Contributions to open-source data engineering projects
  • Published work or projects involving public web data analysis

Compensation

Competitive salary with performance-based bonuses

Work Arrangement

Hybrid remote with office availability in major cities

Team

Collaborative data engineering team within a growing technology division

Technology Stack

  • Primary languages: Python, SQL
  • Frameworks: Scrapy, BeautifulSoup, Selenium
  • Cloud: AWS (S3, EC2, Lambda, CloudWatch)
  • Orchestration: Apache Airflow
  • Databases: PostgreSQL, MongoDB
  • Containerization: Docker, Kubernetes
  • Monitoring: Prometheus, Grafana
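
As a rough illustration of the Python and SQL work this stack implies, the sketch below validates scraped records and loads the clean ones into a table. It makes several assumptions that are not part of the posting: the record fields (`title`, `price`), the `items` table, and the function names are hypothetical, and the standard-library `sqlite3` module stands in for PostgreSQL so the example runs without a database server.

```python
import sqlite3


def validate(record):
    """Return a cleaned record, or None if it fails basic quality checks."""
    title = (record.get("title") or "").strip()
    try:
        price = float(record.get("price"))
    except (TypeError, ValueError):
        return None  # missing or non-numeric price
    if not title or price < 0:
        return None  # empty title or negative price
    return {"title": title, "price": price}


def load(records, conn):
    """Validate records and upsert the clean ones; return how many were loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (title TEXT PRIMARY KEY, price REAL)"
    )
    clean = [r for r in map(validate, records) if r is not None]
    with conn:  # commits on success, rolls back if an insert raises
        conn.executemany(
            # INSERT OR REPLACE is SQLite syntax; PostgreSQL would use ON CONFLICT.
            "INSERT OR REPLACE INTO items (title, price) VALUES (:title, :price)",
            clean,
        )
    return len(clean)
```

A call such as `load(scraped_rows, sqlite3.connect(":memory:"))` would run the whole validate-and-load step against an in-memory database.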

Data Ethics Policy

  • All data collection must comply with website terms of service
  • Respect for robots.txt and crawl-delay directives is mandatory
  • No personal data collection without explicit consent
  • Regular audits of data sources for compliance
  • Transparency in data usage and retention practices
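
The robots.txt and crawl-delay requirement above can be checked in code before any request is made. The following is a minimal sketch using Python's standard `urllib.robotparser`. The helper names, the `example-bot` user agent, and the one-second default delay are assumptions for illustration, not company policy.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Return True if robots.txt permits this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def delay_for(parser, user_agent, default=1.0):
    """Return the Crawl-delay in seconds, or a default when none is declared."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


# Hypothetical robots.txt content, parsed from a string so no network is needed.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

policy = build_policy(SAMPLE_ROBOTS)
print(is_allowed(policy, "example-bot", "https://example.com/private/data"))
print(is_allowed(policy, "example-bot", "https://example.com/public/page"))
print(delay_for(policy, "example-bot"))
```

In a live crawler the parser would instead be pointed at the site's own file with `set_url(...)` and `read()`, and the returned delay would be applied between requests to that host.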

Available for qualified candidates

About company
TechBiz Global
TechBiz Global provides recruitment services to top clients in its portfolio.
Job Details
Department: Engineering
Category: Data
Posted: a month ago