Using Multimodal AI: Analyze Images, Summarize Documents, and Generate Content

Code: its-1961
Date: Mon May 18, 1:00 pm to 4:00 pm
Delivery: Live Online, 1 session
Cost: $325

Before each live online session, Tech Training will provide a Zoom link and any required class materials.

Explore how AI tools can work with more than just text. Practice using Stanford AI Playground and free versions of ChatGPT or Claude to analyze images, extract insights from documents, and generate rich, varied content.

Program Description

AI has moved well beyond text-in, text-out. Today's AI tools can look at a chart and explain what it shows, read a dense PDF and pull out the key points, and generate images, formatted documents, and structured content from a simple prompt. For professionals who work with a mix of media, this opens up possibilities that can dramatically reduce the time spent on analysis, synthesis, and content creation. This course gives participants direct experience with multimodal AI across a range of realistic workplace scenarios, building practical skills and the judgment to know how to get the most out of these tools.

Learning Objectives

Learners will have the opportunity to:
1. Explore how multimodal AI differs from text-only tools and what it can and cannot do reliably
2. Practice uploading and analyzing images, including charts, diagrams, screenshots, and photographs
3. Use AI to summarize, extract key information from, and ask questions about uploaded documents
4. Experiment with generating images and visual content using AI tools
5. Work with AI to produce structured content like reports, presentation outlines, and formatted summaries from mixed inputs
6. Develop strategies for prompting multimodal AI effectively and evaluating the quality of its outputs

Topic Outline

Topics include:
- Introduction to multimodal AI capabilities and limitations
- Image analysis and interpretation
- Document summarization and extraction
- AI image generation
- Structured content creation from mixed inputs
- Prompting strategies for multimodal tasks

Custom training workshops are available for this program

Technology training sessions can be structured around individual or group learning objectives.

Special Group Rates

For groups of 5 or more within the same team or department, special rates are available. Please contact techtraining@stanford.edu for more details.


University IT Technology Training sessions are available to a wide range of participants, including Stanford University staff, faculty, students, and employees of Stanford Hospitals & Clinics, such as Stanford Health Care, Stanford Health Care Tri-Valley, Stanford Medicine Partners, and Stanford Medicine Children's Health.

Additionally, some of these programs are open to interested individuals not affiliated with Stanford, allowing for broader community engagement and learning opportunities.