The AI CIC team took a systematic approach, using AWS services to build a robust, scalable solution that can quickly remediate PDF documents. The project started with an in-depth analysis of the requirements for WCAG 2.1 Level AA compliance and a review of the accessibility issues present in Ohio State's documents. This analysis formed the basis for a remediation process that uses AI and machine learning to automatically identify and correct accessibility gaps. Our solution is designed to be both efficient and cost-effective, reducing the per-page remediation expense to a fraction of a penny. The key AWS services used in this solution are:
- Amazon S3: Used to securely store and manage the documents being remediated
- AWS Lambda: Automates the file processing workflows
- Amazon ECS (AWS Fargate): Runs the containerized document processing tasks
- AWS Step Functions: Coordinates the various processes involved in splitting, processing, and merging documents
- Amazon Bedrock: Generates alt text for images and charts using advanced LLM capabilities
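As a rough illustration of the split, process, and merge flow that AWS Step Functions coordinates, the following Python sketch runs the same pattern locally. It is not the team's implementation: the function names (`split_pages`, `remediate_chunk`, `merge_chunks`, `run_pipeline`) and the chunk size are hypothetical, pages are modeled as plain strings, and no AWS or Adobe API is called.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_pages(pages: List[str], chunk_size: int) -> List[List[str]]:
    """Split a document (modeled as a list of pages) into fixed-size chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [pages[i:i + chunk_size] for i in range(0, len(pages), chunk_size)]


def remediate_chunk(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for the real remediation work (tagging, alt text)."""
    return [f"{page} [remediated]" for page in chunk]


def merge_chunks(chunks: List[List[str]]) -> List[str]:
    """Merge processed chunks back into one document, preserving page order."""
    return [page for chunk in chunks for page in chunk]


def run_pipeline(
    pages: List[str],
    chunk_size: int = 5,
    worker: Callable[[List[str]], List[str]] = remediate_chunk,
) -> List[str]:
    """Split the document, process chunks in parallel, then merge the results."""
    chunks = split_pages(pages, chunk_size)
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in input order, so page order is kept.
        processed = list(pool.map(worker, chunks))
    return merge_chunks(processed)
```

Processing chunks independently is what lets the work fan out across parallel workers, while the ordered merge keeps the remediated pages in their original sequence.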
For bulk processing, remediating 10 pages would cost approximately $0.013130164 (roughly $0.0013 per page), plus Adobe API costs.
This approach ensures that your document remediation processes remain efficient and cost-effective, even when managing large volumes.
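To see how the quoted figure translates to larger collections, the sketch below extrapolates from the cost per 10 pages. It assumes that cost scales linearly with page count, which the source does not confirm, and it treats the Adobe API cost as an optional per-page value supplied by the caller. The function name `estimate_cost` and its parameters are hypothetical.

```python
# Cost quoted in the text for 10 pages of bulk processing, excluding Adobe API costs.
COST_PER_10_PAGES_USD = 0.013130164


def estimate_cost(pages: int, adobe_cost_per_page: float = 0.0) -> float:
    """Estimate remediation cost in USD, assuming cost scales linearly with pages.

    adobe_cost_per_page is a caller-supplied assumption; the text gives no value.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    per_page = COST_PER_10_PAGES_USD / 10
    return pages * (per_page + adobe_cost_per_page)
```

Under this linear assumption, `estimate_cost(100_000)` comes to roughly $131 before Adobe API costs.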
Optimized for scalability, the solution takes about 3.5 minutes to remediate a typical 17-page PDF, making it well suited to large institutions that manage high volumes of documents. By combining AWS services with the Adobe Auto-Tag APIs, the solution supports accessibility compliance at minimal cost, so institutions with large document repositories can scale their remediation efforts without sacrificing speed.
A key element of the solution was the use of the Adobe Auto-Tag APIs, which automatically clean metadata, apply appropriate tags, and further enable document remediation. The project involved continuous iteration and testing, allowing the team to refine the AI model and achieve high compliance rates efficiently.
Industry Impact and Problem Solving
The challenges Ohio State faces in making its documents accessible are not unique. Many educational institutions, cultural heritage institutions, government agencies, and businesses struggle to comply with accessibility standards, especially given the vast number of legacy documents that exist. The solution developed in partnership with the AI CIC addresses these challenges by providing an automated, scalable approach to document remediation. This not only helps Ohio State meet regulatory expectations but also increases access to the Libraries’ digital assets for all students, faculty, and researchers.
“Our scale is massive, and we are committed to doing what's best for our patrons. With the introduction of new accessibility standards, AI and machine learning offer what may be the most viable path to success, given our resources and scope. I'm excited about the potential of using AI to enhance the experience for those we serve.”
Cory Tressler, Assistant Dean for Technology and Digital Programs, The OSU Libraries
Potential for Wider Application
The success of this project demonstrates the potential for wider application across other educational institutions, cultural heritage institutions, government agencies, businesses, and any other organization facing similar challenges with document accessibility. The AI-driven solution can be customized to meet the needs of different organizations, helping them achieve accessibility at scale for large document collections.