Competitive Intelligence from SEC Filings with an AWS Data Lake


Competitive intelligence is the ability to capture, analyze, and act on information about markets, products, demographics, and competitors. Leaders need insight into their sector and their competitors to understand market movements and make decisions. Much of the collection and analysis of data from public sources such as SEC 8-K, 10-Q, and 10-K filings is still manual and time-consuming. A US-based provider of no-code NLP solutions wanted to build a corporate research platform for its client, one that delivers faster and more comprehensive customer and competitor insights to drive sales and strategy.

The platform had to address the following requirements:


  • Handling structured and unstructured data in SEC filings, including XBRL and free-form text.
  • Designing a flexible, simple, and scalable infrastructure and data pipeline to handle the complexity of the data processing.
  • Supporting complex and custom analytics at different stages of data processing and publishing the results to distribution channels such as Snowflake, Kafka, and S3, as well as last-mile tools such as Excel and dashboards.


As an AWS partner, Digital Alpha has extensive experience helping clients design and implement data lakes on the AWS platform. For this client, the team built a corporate research platform and set up a data lake to store and process the data efficiently.

Our solution included the following components:

  • A producer that ingests data into the raw bucket at periodic intervals.
  • AWS Glue for data processing and analysis.
  • AWS Glue Data Catalog for storing metadata in a central repository.
  • AWS Lambda and Step Functions for scheduling and orchestrating AWS Glue ETL jobs.
  • Amazon Athena for interactive queries and analysis.
  • Various AWS services for logging, monitoring, security, authentication, authorization, alerting, and notification.
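
As a rough illustration of the orchestration component listed above, the sketch below builds an Amazon States Language definition that runs Glue jobs one after another. It is a minimal sketch, not the definition used in this project: the job names (`parse-raw-filings`, `extract-xbrl-facts`, `publish-results`) and the retry settings are hypothetical. The `glue:startJobRun.sync` resource is the Step Functions service integration that waits for a Glue job run to finish before moving to the next state.

```python
import json


def build_state_machine(job_names):
    """Chain Glue jobs so each starts only after the previous one succeeds."""
    if not job_names:
        raise ValueError("at least one Glue job name is required")

    states = {}
    for index, job in enumerate(job_names):
        state = {
            "Type": "Task",
            # Step Functions integration that runs a Glue job and waits for it.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": job},
            # Hypothetical retry policy; tune to the workload.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 60,
                    "MaxAttempts": 2,
                    "BackoffRate": 2.0,
                }
            ],
        }
        if index + 1 < len(job_names):
            state["Next"] = job_names[index + 1]
        else:
            state["End"] = True
        states[job] = state

    return {
        "Comment": "Sequential Glue ETL pipeline (illustrative)",
        "StartAt": job_names[0],
        "States": states,
    }


# Hypothetical job names, for illustration only.
definition = build_state_machine(
    ["parse-raw-filings", "extract-xbrl-facts", "publish-results"]
)
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON string could be supplied as the definition when creating a state machine, with a scheduled Lambda function or rule starting an execution after each ingestion cycle.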

As a result, the client could gather and analyze competitive intelligence more quickly and accurately, make data-driven decisions, and gain a competitive edge in its market. Our AWS data lake expertise allowed us to provide a flexible and scalable platform that met the client’s specific needs and requirements.


Digital Alpha helped the client consume raw XBRL and non-XBRL data from customers’ and competitors’ SEC reports to support critical business decisions and sharpen competitive insights.

The proprietary automated solution converted structured and unstructured data into a readable, machine-processable JSON format and published the results to a Snowflake data share. With this fully automated pipeline, the client could perform various analytics, run quant models, and slice and dice the available data.
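
To make the XBRL-to-JSON step more concrete, here is a minimal sketch using only the Python standard library. It is not the proprietary solution: the sample document is a tiny hand-written XBRL-like instance with placeholder values, and the output field names (`concept`, `period_end`, and so on) are assumptions chosen for illustration. Real filings involve many more contexts, units, and dimensions than this handles.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the core XBRL instance elements (context, period, ...).
XBRLI = "{http://www.xbrl.org/2003/instance}"

# Hand-written sample with placeholder values; not taken from any real filing.
SAMPLE_XBRL = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:us-gaap="http://fasb.org/us-gaap/2023">
  <context id="FY2023">
    <entity><identifier scheme="http://www.sec.gov/CIK">0000000000</identifier></entity>
    <period><startDate>2023-01-01</startDate><endDate>2023-12-31</endDate></period>
  </context>
  <us-gaap:Revenues contextRef="FY2023" unitRef="USD" decimals="-6">125000000</us-gaap:Revenues>
  <us-gaap:NetIncomeLoss contextRef="FY2023" unitRef="USD" decimals="-6">18000000</us-gaap:NetIncomeLoss>
</xbrl>
"""


def facts_to_records(xbrl_text):
    """Flatten top-level XBRL facts into JSON-friendly dictionaries."""
    root = ET.fromstring(xbrl_text)

    # Map each context id to its period end date, when one is present.
    period_ends = {}
    for ctx in root.findall(f"{XBRLI}context"):
        end = ctx.find(f"{XBRLI}period/{XBRLI}endDate")
        period_ends[ctx.get("id")] = end.text if end is not None else None

    records = []
    for element in root:
        context_id = element.get("contextRef")
        if context_id is None:
            continue  # contexts, units, etc. are not facts
        # Tags look like "{namespace-uri}LocalName".
        namespace, _, concept = element.tag[1:].partition("}")
        records.append(
            {
                "namespace": namespace,
                "concept": concept,
                "context": context_id,
                "period_end": period_ends.get(context_id),
                "unit": element.get("unitRef"),
                "value": (element.text or "").strip(),
            }
        )
    return records


records = facts_to_records(SAMPLE_XBRL)
records_json = json.dumps(records, indent=2)
```

Values are kept as strings here so that scaling and precision (the `decimals` attribute) can be handled explicitly in a later processing stage.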

Finally, they established a data pipeline that removed human intervention, eliminated manual errors, and achieved deployment at speed and scale.

Key advantages of the solution include:

  • Smart text extraction using NLP
  • Conversion of complex data, from XBRL to free-form text, into structured data
  • Scalable, reliable, and secure data platform
  • Real-time analytics
  • Signal generation from competitor and customer performance by revenue, product line, customer positioning, order backlog, etc.
