

Building LLM Clinical Data Analytics Pipelines with AWS Bedrock

Leveraging LLMs like Claude and Gemma for clinical data analysis

LLM
NLP
AWS
Bedrock
Claude
Gemma
Llama
Math
Author: Nigel Gebodh

Published: April 11, 2025


Image by FLUX.1 and Nigel Gebodh.


Large Language Models (LLMs) have transformed the way both unstructured and structured data analytics is performed.

Here we leverage prompt engineering techniques and LLMs on Amazon Web Services (AWS) Bedrock (and Hugging Face) to build an analytics pipeline specific to healthcare applications with clinical data.

You can jump straight to the application and experiment with different clinical analytics scenarios/tasks here.

LLM clinical data analysis app using Anthropic’s Claude-3 Haiku.

What exactly is Prompt Engineering?


Prompt Engineering:

  • Is the process of creating a prompt or input (e.g. text, audio, image, etc.) to a Generative AI model or LLM in order to retrieve or guide an output.
  • It generally involves crafting systematic instructions that guide an LLM to effectively perform a task.

Anatomy of Prompts

There are several key aspects to effectively creating a prompt to feed to an LLM. We’ll leverage combinations of these prompt components in our data analysis pipeline.

To get a more in-depth understanding of the state of the art in prompting, check out these papers (Brown et al., 2020; Qiao et al., 2023; Schulhoff et al., 2025).

These are some of the basic prompt aspects (You can test them with different LLMs here):

Directive

The user’s intent for the LLM. It can be stated explicitly as an instruction or implied by the format of the text or by the examples provided (a single example is referred to as one-shot).

Example: We want a direct set of information returned

  • Explicit: Instruction given on what output to produce

    • INPUT:
      Generate a list of the five longest rivers in the world.

    • OUTPUT:
      Nile River, Amazon River, Yangtze River, Mississippi River, Yenisei River

  • Implicit: Task implied by the structure of the text

    • INPUT:
      New York City: United States
      Paris:

    • OUTPUT:
      France

Output Formatting

If a structured output (JSON, CSV, XML, etc.) is needed from the LLM, specifying and defining the desired output guides the model to return that format. This is also helpful for evaluating a model’s performance when you have a structured ground-truth reference to compare against the model’s output (a small validation sketch follows the JSON example below).

Example: We want JSON structured (basically a nested dictionary) information returned. We can give an example or have the LLM extract and classify the information to JSON tags (dict keys).

  • INPUT:
    Return the following text in `JSON` format: My name is Sam Mitra. I’m currently 28 and I live in Redmond, WA. I have 2 dogs and 1 cat.

  • OUTPUT:

    {
      "name": "Sam Mitra",
      "age": 28,
      "location": "Redmond, WA",
      "pets": [
        {"type": "dog", "count": 2},
        {"type": "cat", "count": 1}
      ]
    }
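
Because the output format is explicit, it can also be checked programmatically. Below is a minimal validation sketch (not part of the original example; the llm_output string is a stand-in for an actual model response):

import json

# Hypothetical raw text standing in for an actual model response to the example above
llm_output = """
{
  "name": "Sam Mitra",
  "age": 28,
  "location": "Redmond, WA",
  "pets": [
    {"type": "dog", "count": 2},
    {"type": "cat", "count": 1}
  ]
}
"""

try:
    parsed = json.loads(llm_output)  # fails loudly if the model drifted from valid JSON
    missing = [k for k in ("name", "age", "location", "pets") if k not in parsed]
    if missing:
        print(f"Valid JSON, but missing keys: {missing}")
    else:
        print(parsed["name"], parsed["age"])  # -> Sam Mitra 28
except json.JSONDecodeError as e:
    print(f"Output was not valid JSON: {e}")
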
Style Instructions

These are instructions that control the LLM’s output style by including descriptive adjectives like “clear”, “concise”, “casual”, or “professional” in our prompts, which directly affect the vocabulary used.

Example: We want a clear, concise definition of what a dog is

  • INPUT:
    Give a clear and concise definition of what a dog is

  • OUTPUT:
    A dog is a domesticated carnivorous mammal (Canis lupus familiaris) that belongs to the family Canidae.

Example:
We want a casual, friendly definition of what a dog is

  • INPUT:
    Give a casual and friendly definition of what a dog is

  • OUTPUT:
    Dogs are basically the best! They're furry, four-legged friends that are all about giving you love, snuggles, and slobbery kisses.

Role

This is a persona or perspective that you assign to the LLM for producing outputs. It can influence both the style and the content returned.

Example: Define the LLM’s role as a “biomedical engineer and medical doctor”

  • INPUT:
    As a biomedical engineer and medical doctor, give a short, concise explanation of what sea sickness is.

  • OUTPUT:
    Sea sickness, also known as motion sickness, is a complex phenomenon resulting from the mismatch between the vestibular system's perception of motion and the visual and proprioceptive systems' input.

Example: Define the LLM’s role as a “dog”

  • INPUT:
    As a dog, give a short, concise explanation of what sea sickness is.

  • OUTPUT:
    WOOF WOOF! whine Oh no, oh no! Sea sickness is when my hooman gets all wobbly and queasy on the boat!

The Math

Effective prompt engineering relies on the fact that modern LLMs are trained on vast datasets, enabling them to understand and follow instructions.

Generally, we can think of LLMs as autoregressive functions that, when given a sequence of tokens (e.g. words), produce a score (logit) for each possible next token. These scores are fed through a Softmax function, which normalizes them into a probability distribution; selecting the token with the highest probability is known as greedy decoding. See (Gebodh, 2024b) for a dive into Softmax calculations and LLM temperature variations; a toy numerical sketch of this follows the equation below.

A visual recap of LLM word generation.

We can parameterize this process as:

\[ P( \mathbf{Y} | \mathbf{X}) = \prod_{t=1}^{|\mathbf{Y} |} p_{\text{LM}}(y_t | y_{1:t-1}, \mathbf{X}) \]

This basically reads as:

  • What is the conditional probability of \(\mathbf{Y}\) given input \(\mathbf{X}\), where \(\mathbf{X}\) are tokens that can be text, images, video, audio etc.
  • We calculate the probability of the output (\(\mathbf{Y}\)) by multiplying (\(\prod\)) the probabilities of each output token (\(y_t\)), each conditioned on the previously generated tokens (\(y_{1:t-1}\)) and the input (\(\mathbf{X}\)).
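
As a toy numerical illustration (the vocabulary and logit values below are made up, not from any real model), the sketch below normalizes next-token scores with a Softmax and then picks the highest-probability token, i.e. greedy decoding:

import numpy as np

def softmax(logits, temperature=1.0):
    """Normalize raw next-token scores (logits) into a probability distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()                          # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return probs

# Toy vocabulary and made-up logits for the next token after "The cat and the dog"
vocab = ["sleep", "run", "bark", "is"]
logits = [3.2, 1.1, 0.4, -0.5]               # illustrative values only

probs = softmax(logits)
next_token = vocab[int(np.argmax(probs))]    # greedy decoding: take the most probable token
print(dict(zip(vocab, probs.round(3))), "->", next_token)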

Prompting


For prompt engineering we can think of the input (\(\mathbf{X}\)) as having two parts:

  • \(\mathcal{T}\) - the prompt, meaning the general instruction or task you want the LLM to do
    • Example \(\mathcal{T}\): Translate from English to French:
  • \(\mathcal{Q}\) - the specific example or question that you’re passing to the LLM
    • Example \(\mathcal{Q}\): The cat and the dog sleep

The full input with returned output would be:

  • Input: \(\mathcal{T}\), \(\mathcal{Q}\): Translate from English to French: The cat and the dog sleep
  • Output: \(\mathbf{Y}\) or \(\mathbf{A}\): Le chat et le chien dorment

We can rewrite our general LLM parameterization as:

\[\begin{align*} P( \mathbf{Y} | \textcolor{Plum}{\mathbf{X}}) & = \prod_{t=1}^{|\mathbf{Y} |} p_{\text{LM}}(y_t | y_{1:t-1}, \textcolor{Plum}{\mathbf{X}}) \\ P( \mathbf{Y} | \textcolor{Plum}{\mathcal{T} }, \textcolor{Plum}{\mathcal{Q}}) & = \prod_{t=1}^{|\mathbf{Y} |} p_{\text{LM}}(y_t | y_{1:t-1}, \textcolor{Plum}{\mathcal{T} }, \textcolor{Plum}{\mathcal{Q}}) \\ \end{align*}\]

By convention, we can change \(\mathbf{Y} \to \mathbf{A}\), meaning Complete Answer, where \(a_t\) is the \(t^{th}\) generated token:

\[\begin{align*} P( \textcolor{SeaGreen}{\mathbf{Y}} | \textcolor{Plum}{\mathcal{T} }, \textcolor{Plum}{\mathcal{Q}}) & = \prod_{t=1}^{|\mathbf{Y} |} p_{\text{LM}}(y_t | y_{1:t-1}, \textcolor{Plum}{\mathcal{T} }, \textcolor{Plum}{\mathcal{Q}}) \\ P( \textcolor{SeaGreen}{\mathbf{A}} | \textcolor{Plum}{\mathcal{T} }, \textcolor{Plum}{\mathcal{Q}}) & = \prod_{t=1}^{|\textcolor{SeaGreen}{\mathbf{A}} |} p_{\text{LM}}(\textcolor{SeaGreen}{a}_{t} | \textcolor{SeaGreen}{a}_{1:t-1}, \textcolor{Plum}{\mathcal{T} }, \textcolor{Plum}{\mathcal{Q}}) \\ \end{align*}\]

Note: The calculations stop when an end-of-sequence token (e.g. <EOS> or <|endoftext|>) is generated.

Prompt Templating


We can generalize the equation even more by thinking of the prompt (\(\mathcal{T}\)) as a function or a template that takes in different questions/queries (\(\mathcal{Q} \to x^{*}\)). We can represent the prompt template as \(\mathcal{T(\cdot)}\) and when a question/query is plugged in we get \(\mathcal{T(x^{*})}\). The equation then becomes:

\[\begin{align*} P( \textcolor{SeaGreen}{\mathbf{A}} | \textcolor{Plum}{\mathcal{T(x^{*})} } ) & = \prod_{t=1}^{|\textcolor{SeaGreen}{\mathbf{A}} |} p_{\text{LM}}(\textcolor{SeaGreen}{a}_{t} | \textcolor{SeaGreen}{a}_{1:t-1}, \textcolor{Plum}{\mathcal{T(x^{*})} }) \\ \end{align*}\]

For example:

  • \(\mathcal{T(\cdot)}\) - The template could be:
    What is the capital of {ENTER COUNTRY}?
  • \(x^{*}\) - The query or question would be:
    France
  • \(\mathcal{T(x^{*})}\) - The full input would be:
    What is the capital of France?
  • \(\textcolor{SeaGreen}{\mathbf{A}}\) - The returned answer would be:
    Paris
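
In code, a template \(\mathcal{T(\cdot)}\) is just a function (or f-string) that accepts the query \(x^{*}\). A minimal sketch, with an illustrative helper name:

# A minimal sketch of a prompt template T(.) as a plain Python function
# (the helper name and template text here are illustrative).
def prompt_template(query: str) -> str:
    """T(x*): plug a query into the fixed instruction template."""
    return f"What is the capital of {query}?"

x_star = "France"                        # x*, the query
full_prompt = prompt_template(x_star)    # T(x*)
print(full_prompt)                       # -> What is the capital of France?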

Few-Shot Prompting


We can think of few-shot prompting as supplying the template \(\mathcal{T(\cdot)}\) with an additional input: a set of examples demonstrating the task. We can represent the set of examples as \(\phi\), where:

\[\begin{align*} \phi = \{ (z_{i}, y_{i} ) \}_{i=1}^{n} \end{align*}\]

In this case:

  • \(z_i\) is just example \(i\) that we’re passing to the LLM with our query
  • \(y_i\) is the corresponding answer to the \(i^{th}\) example.

For example, for the task of getting country capitals, a 5-shot prompt (\(n = 5\) examples, where \(z_i\) is a country and \(y_i\) is that country’s capital) would use the following \(\phi\):

\[\begin{align*} \phi = \{ & (z_1 = \text{Algeria},\; y_1 = \text{Algiers}), \\ & (z_2 = \text{France},\; y_2 = \text{Paris}), \\ & (z_3 = \text{Spain},\; y_3 = \text{Madrid}), \\ & (z_4 = \text{Norway},\; y_4 = \text{Oslo}), \\ & (z_5 = \text{Chile},\; y_5 = \text{Santiago}) \,\} \end{align*}\]

The few-shot prompted equation then looks like:

\[\begin{align*} P( \textcolor{SeaGreen}{\mathbf{A}} | \textcolor{Plum}{\mathcal{T(\phi, x^{*})} } ) & = \prod_{t=1}^{|\textcolor{SeaGreen}{\mathbf{A}} |} p_{\text{LM}}(\textcolor{SeaGreen}{a}_{t} | \textcolor{SeaGreen}{a}_{1:t-1}, \textcolor{Plum}{\mathcal{T(\phi, x^{*})} }) \\ \end{align*}\]

For example the full input prompt would then be:

  • \(\mathcal{T(\cdot)}\):
    What is the capital of {ENTER COUNTRY}?
  • \(\phi\):
    (Algeria, Algiers)
    (France, Paris)
    (Spain, Madrid)
    (Norway, Oslo)
    (Chile, Santiago)
  • \(x^{*}\):
    Germany
  • \({\mathcal{T(\phi, x^{*})} }\):
    What is the capital of Algeria?
    Algiers
    What is the capital of France?
    Paris
    What is the capital of Spain?
    Madrid
    What is the capital of Norway?
    Oslo
    What is the capital of Chile?
    Santiago
    What is the capital of Germany?
This full prompt (\({\mathcal{T(\phi, x^{*})} }\)) would then be passed to the LLM which should return the response (\(\textcolor{SeaGreen}{\mathbf{A}}\)) Berlin.
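
Assembling \(\mathcal{T(\phi, x^{*})}\) in code amounts to concatenating the worked examples followed by the new query. A minimal sketch, with illustrative names:

# A minimal sketch (illustrative names) of assembling the few-shot prompt T(phi, x*).
phi = [
    ("Algeria", "Algiers"),
    ("France", "Paris"),
    ("Spain", "Madrid"),
    ("Norway", "Oslo"),
    ("Chile", "Santiago"),
]

def few_shot_prompt(examples, query):
    """T(phi, x*): prepend the worked examples, then ask the new question."""
    lines = []
    for z_i, y_i in examples:
        lines.append(f"What is the capital of {z_i}?")
        lines.append(y_i)
    lines.append(f"What is the capital of {query}?")
    return "\n".join(lines)

print(few_shot_prompt(phi, "Germany"))   # the LLM should complete this with "Berlin"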

What exactly is AWS Bedrock?


AWS Bedrock (Amazon Web Services):

  • Is an Amazon product that provides a platform for leveraging Foundation Models (FM)/LLMs.
  • It gives access to a number of LLMs, all on one platform.
  • The models can be accessed through a central API (Application Programming Interface).

Bedrock abstracts away infrastructure complexities like compute and GPU provisioning, enabling easier development and experimentation with various models. Currently, the Bedrock platform hosts models from Anthropic, Meta, Cohere, Mistral, AI21, and others.

AWS - Getting Started


To begin using AWS Bedrock:

  1. Create an AWS account.
  2. Create an IAM user and obtain access keys (AWS Access Key ID and AWS Secret Access Key - see here).
    • These keys are used for API authentication and should be stored securely as environment secrets or in a .env file (see the sketch after this list).
  3. Request model access in the Bedrock console.
    • On Bedrock we can go to Model access then request the models we want to work with. This will take a few minutes for access to be granted.
  4. Install the boto3 Python library for API interaction with Bedrock
    • Bedrock uses the boto3 Python library to set up information transfer to and from AWS.
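
As a hedged sketch of step 2 (the .env layout and variable names below mirror the ones used later in this post, but your setup may differ), the credentials can be loaded with python-dotenv before creating the Bedrock client:

# A minimal sketch (assumed .env layout and variable names) of loading Bedrock credentials
# as environment secrets before creating the client.
# Requires: pip install boto3 python-dotenv
#
# Example .env file (placeholder values):
#   AWS_ACCESS_KEY_ID=AKIA...
#   AWS_SECRET_ACCESS_KEY=...
#   AWS_DEFAULT_REGION=us-east-1

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file into the process environment

# Fail early if a credential is missing, before making any Bedrock calls
for key in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION"):
    assert os.environ.get(key), f"Missing environment variable: {key}"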

AWS - Setting up Bedrock


Once we have all our credentials in order, we can set up our connection to AWS. To establish the connection we import the:

  • boto3 library to set up our runtime connection
  • os library to get our environment secrets (API ID and key)

Import libraries:

import boto3
import os

Set up our AWS Bedrock access:

# Configure AWS credentials
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name=os.environ['AWS_DEFAULT_REGION'], #Region where the Bedrock service is running e.g. us-east-1
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'], #AWS Access Key ID
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'] #AWS Secret Access Key
)

bedrock_runtime
<botocore.client.BedrockRuntime at 0x78c610c21750>

AWS - LLM API Templates


Historically, Bedrock’s API required model-specific request formats (and for some models it still does), which vary significantly between models like Llama 3.3 70B Instruct and Claude V2; a hedged invocation sketch follows the examples below.

Example Request Differences:

  • Llama 3.3 70B Instruct:

    {
    "modelId": "meta.llama3-3-70b-instruct-v1:0",
    "contentType": "application/json",
    "accept": "application/json",
    "body": "{\"prompt\":\"this is where you place your input text\",\"max_gen_len\":512,\"temperature\":0.5,\"top_p\":0.9}"
    }
  • Anthropic Claude V2:

    {
      "modelId": "anthropic.claude-v2",
      "contentType": "application/json",
      "accept": "*/*",
      "body": "{\"prompt\":\"\\n\\nHuman: Hello world\\n\\nAssistant:\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\n\\nHuman:\"],\"anthropic_version\":\"bedrock-2023-05-31\"}"
    }
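
For reference, here is a hedged sketch of sending one of these model-specific bodies through the lower-level InvokeModel API, reusing the bedrock_runtime client created earlier. The request keys follow the Llama example above; the generation response key is an assumption based on Meta’s Bedrock response format and differs for other providers (Claude V2, for example, returns completion):

import json

# A hedged sketch of calling the lower-level InvokeModel API with a model-specific body.
llama_body = {
    "prompt": "What are the symptoms of pneumonia?",
    "max_gen_len": 512,
    "temperature": 0.5,
    "top_p": 0.9,
}

response = bedrock_runtime.invoke_model(
    modelId="meta.llama3-3-70b-instruct-v1:0",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(llama_body),          # each provider expects its own body schema
)

result = json.loads(response["body"].read())
print(result.get("generation"))           # response key assumed from Meta's format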

To streamline model interaction, Bedrock introduced the Converse API, providing a unified format. Now, you only need to specify:

  • modelId: The model identifier.
  • messages: The conversation containing the input text.
  • inferenceConfig: Model parameters like maxTokens and temperature.

AWS - Example

Scenario: We want to ask an LLM about the symptoms of pneumonia using Anthropic’s Claude model on AWS Bedrock.

  1. Define the User’s Question:
    First, we create the user’s question, which we’ll call the user_message or query.

    # Start a conversation with the user message.
    user_message= "What are the symptoms of pneumonia?"
  2. Specify the Model:
    Next, we choose the LLM we want to use. In this case, we’ll use Anthropic’s Claude 3 Haiku.

    # Set the model ID, e.g., Claude 3 Haiku.
    model_id = "anthropic.claude-3-haiku-20240307-v1:0"
  3. Prepare the Conversation Format:
    The Bedrock API expects the input in a specific “conversation” format, where each message has a role (either “user” or “assistant”) and content.

    conversation = [
        {
            "role": "user",
            "content": [{"text": user_message}],
        }
    ]
  4. Define Inference Configuration:
    We define the inference_config for the model, which includes parameters like maximum tokens and temperature.

    inference_config = {
        "maxTokens": 512,
        "temperature": 0.5,
        "topP": 0.9,
    }
  5. Send the Request and Receive the Response:
    We use the bedrock_runtime.converse function to send the conversation and inference_config to the specified model_id.

    bedrock_runtime = boto3.client(service_name="bedrock-runtime", 
                                    region_name="us-east-1")
    
    response =  bedrock_runtime.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig=inference_config
        )
  6. Extract the Response Text:
    We extract the model’s response text from the JSON response.

    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)
  7. Encapsulate in a Function:
    We can create a function invoke_model to encapsulate these steps, making it easier to reuse (see code in ‘Putting it all together’ section). We’ll display the User Query and LLM Response in chat bubbles so it’s easier to follow the exchange of information.

    invoke_model

    
     #Define a helper function to call the model and get a response
     def invoke_model(model_id, user_message, conversation=None, inference_config=None, verbose=False):
         # # Set the model ID, e.g., Claude 3 Haiku.
         # model_id = "anthropic.claude-3-haiku-20240307-v1:0"
    
         # # Start a conversation with the user message.
         # user_message = "Describe the python language."
    
         if conversation is None:
             conversation = [
                 {
                     "role": "user",
                     "content": [{"text": user_message}],
                 }
             ]
    
         # Set the inference configuration.
         if inference_config is None:
             inference_config = {
                 "maxTokens": 512,
                 "temperature": 0.5,
                 "topP": 0.9,
             }
    
         try:
             # Send the message to the model, using a basic inference configuration.
             response = bedrock_runtime.converse(
                 modelId=model_id,
                 messages=conversation,
                 inferenceConfig=inference_config
             )
    
             # Extract and print the response text.
             response_text = response["output"]["message"]["content"][0]["text"]
             if verbose:
                 print(f"Request: {user_message}")
                 print(f"Response: {response_text}")
    
             return response_text
    
         except (Exception) as e:
             print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
             exit(1)
    
    
    
     # Start a conversation with the user message.
     user_message = user_query = "What are the symptoms of pneumonia?"
    
     #Pass the message to the model
     model_id = "anthropic.claude-3-haiku-20240307-v1:0"
     response = model_response = invoke_model(model_id, user_message)
    
    
    
     #Display the user query and model response
     display_chat_bubble(user_query, sender="user")
     display_chat_bubble(model_response, sender="model")
👤User Query

What are the symptoms of pneumonia?

🩺 LLM Response

The main symptoms of pneumonia include:

  • Cough - This is often the first symptom and may produce mucus or phlegm. The cough may be dry at first.
  • Fever and chills - Pneumonia often causes a high fever, sometimes over 102°F (39°C).
  • Difficulty breathing - Breathing may be faster and shallower than normal. Patients may feel short of breath, especially with activity.
  • Chest pain - There is often a sharp or stabbing chest pain, especially when breathing deeply or coughing.
  • Fatigue - Pneumonia can cause significant tiredness and weakness.

Other potential symptoms include:

  • Nausea and vomiting
  • Diarrhea
  • Confusion, especially in older adults

The specific symptoms can vary depending on the type of pneumonia, the causative agent, and the patient's age and overall health. Symptoms are usually more severe in the elderly, young children, and those with underlying medical conditions.

Seeking medical attention is important, as pneumonia can be a serious and potentially life-threatening condition that requires prompt treatment.

What exactly is Clinical Data?


Clinical or medical data encompasses a collection of health information generated through patients’ interaction with healthcare systems. This can include:

  • Electronic Health Records (EHRs) containing patient demographics, medical history, diagnoses, medications, treatment plans, laboratory results, and imaging reports
  • Clinical trial data and research findings
  • Claims and billing information
  • Data from wearable devices and remote monitoring systems

Clinical data is particularly complex because it combines structured data (like tabular diagnostic codes and lab values) with unstructured data (such as clinical notes, radiology reports, and patient narratives). This complexity makes it both challenging and valuable for analytical purposes.

This section demonstrates clinical data analysis using AWS Bedrock. We’ll explore how to extract insights from both unstructured and structured clinical data using advanced prompt engineering.

We’ll cover these clinical analytics use cases:

  • Clinical Narrative Summarization: Few-shot prompting for concise medical summaries.
  • Diagnosis Generation: Chain-of-thought prompting for preliminary diagnoses.
  • Clinical Classification: Zero-shot prompting for data classification.
  • Structured Data Extraction: XML-based prompting for targeted information retrieval.

Clinical Narrative Summarization

LLM workflow for clinical narrative summarization.

Our goal is to get the model to summarize a clinical report.

Here we’ll use a few-shot prompt to get the model to understand the task by showing it examples of the task. The task will be to summarize some medical text into:

  • Key Findings
  • Recommendations

We will provide the model with these examples:

  • Example 1:

INPUT:
MRI of the right knee demonstrates a complex tear of the posterior horn and body of the medial meniscus. There is moderate joint effusion and mild chondromalacia of the patellofemoral compartment. The anterior and posterior cruciate ligaments are intact.

OUTPUT:
Key Findings: Complex tear of medial meniscus (posterior horn and body); Moderate joint effusion; Mild patellofemoral chondromalacia
Recommendations: Orthopedic consultation; Consider arthroscopic evaluation and treatment

  • Example 2:

INPUT:
Chest X-ray shows clear lung fields bilaterally with no evidence of consolidation, effusion, or pneumothorax. The heart size is within normal limits. No acute cardiopulmonary process identified.

OUTPUT:
Key Findings: Clear lung fields; Normal heart size; No acute cardiopulmonary process
Recommendations: No further imaging needed at this time

  • Model tasked to summarize text:

INPUT:
Patient presents with a 3-day history of high fever (103°F), chills, productive cough with yellow sputum, and pleuritic chest pain exacerbated by deep breathing. Physical exam reveals decreased breath sounds and crackles in the right lower lung field. ECG shows sinus tachycardia, but is otherwise unremarkable. Chest X-ray demonstrates consolidation in the right lower lobe, and a small right-sided pleural effusion. White blood cell count is elevated.

OUTPUT:

What do we expect?

The model should learn from the examples to extract ailment-related information (e.g., elevated WBC, high fever) and provide appropriate treatment recommendations.

For example an output like below would be consistent with what we expect:

OUTPUT:
Key Findings: Right lower lobe consolidation; Right-sided pleural effusion; High fever; Decreased breath sounds and crackles; Elevated WBC.
Recommendations: Initiate broad-spectrum antibiotics for bacterial pneumonia; Consider sputum cultures; Monitor oxygen saturation; Pain management; Follow-up chest X-ray to assess resolution.

Model Output

Code
# Few-shot learning example for medical report summarization

# Define the examples and query
example_1 = """
INPUT: 
MRI of the right knee demonstrates a complex tear of the posterior horn and body of the medial meniscus. 
There is moderate joint effusion and mild chondromalacia of the patellofemoral compartment. 
The anterior and posterior cruciate ligaments are intact.
OUTPUT:
**Key Findings:** Complex tear of medial meniscus (posterior horn and body); Moderate joint effusion; Mild patellofemoral chondromalacia
**Recommendations:** Orthopedic consultation; Consider arthroscopic evaluation and treatment
"""
example_2 = """
INPUT:
Chest X-ray shows clear lung fields bilaterally with no evidence of consolidation, effusion, or pneumothorax. 
The heart size is within normal limits. 
No acute cardiopulmonary process identified.
OUTPUT:
**Key Findings:** Clear lung fields; Normal heart size; No acute cardiopulmonary process
**Recommendations:** No further imaging needed at this time
"""

query = """
INPUT:
Patient presents with a 3-day history of high fever (103°F), chills, productive cough with yellow sputum, and pleuritic chest pain exacerbated by deep breathing. 
Physical exam reveals decreased breath sounds and crackles in the right lower lung field. 
ECG shows sinus tachycardia, but is otherwise unremarkable. 
Chest X-ray demonstrates consolidation in the right lower lobe, and a small right-sided pleural effusion. White blood cell count is elevated.
"""

# Combine the examples and query into a single prompt
# We'll use f-strings to insert the content from the examples and query into the full prompt.
few_shot_prompt = f"""
Task: Summarize the following medical report into key findings and recommendations.
Follow  the format of the examples strictly. Make the summary concise and actionable.
The key findings should be short and to the point.
The key findings should be less than 5 words.
The recommendations should be clear and specific.
The summary should be in the following format:
**Key Findings:** [colon-separated list of key findings]
**Recommendations:** [colon-separated list of recommendations]

{example_1}
{example_2}
Now summarize this medical report:
{query}
"""

print(few_shot_prompt)
Code
#Pass the few-shot prompt to the model
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
response = model_response = invoke_model(model_id, few_shot_prompt)
Code
#Display the full prompt and model response
display_chat_bubble(few_shot_prompt, sender="user")
display_chat_bubble(model_response, sender="model")
👤User Query

Task: Summarize the following medical report into key findings and recommendations. Follow the format of the examples strictly. Make the summary concise and actionable. The key findings should be short and to the point. The key findings should be less than 5 words. The recommendations should be clear and specific. The summary should be in the following format: Key Findings: [colon-separated list of key findings] Recommendations: [colon-separated list of recommendations]

INPUT: MRI of the right knee demonstrates a complex tear of the posterior horn and body of the medial meniscus. There is moderate joint effusion and mild chondromalacia of the patellofemoral compartment. The anterior and posterior cruciate ligaments are intact. OUTPUT: Key Findings: Complex tear of medial meniscus (posterior horn and body); Moderate joint effusion; Mild patellofemoral chondromalacia Recommendations: Orthopedic consultation; Consider arthroscopic evaluation and treatment

INPUT: Chest X-ray shows clear lung fields bilaterally with no evidence of consolidation, effusion, or pneumothorax. The heart size is within normal limits. No acute cardiopulmonary process identified. OUTPUT: Key Findings: Clear lung fields; Normal heart size; No acute cardiopulmonary process Recommendations: No further imaging needed at this time

Now summarize this medical report:

INPUT: Patient presents with a 3-day history of high fever (103°F), chills, productive cough with yellow sputum, and pleuritic chest pain exacerbated by deep breathing. Physical exam reveals decreased breath sounds and crackles in the right lower lung field. ECG shows sinus tachycardia, but is otherwise unremarkable. Chest X-ray demonstrates consolidation in the right lower lobe, and a small right-sided pleural effusion. White blood cell count is elevated.

🩺 LLM Response

Key Findings: High fever; Productive cough; Right lower lobe consolidation; Pleural effusion; Elevated WBC Recommendations: Initiate antibiotic therapy; Consider chest CT scan; Monitor for clinical improvement

What do we see?

The model’s output aligned with the prompt, providing contextually accurate Key Findings and Recommendations for the pneumonia scenario, and respecting the defined response parameters.
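
Because the output follows the requested structure, we can also spot-check it programmatically. The rough sketch below (not part of the original pipeline) pulls out the Key Findings section and measures how many of the expected findings appear; the expected_findings set is a hand-picked shorthand for the expected output above:

import re

# A rough, illustrative check: compare the model's Key Findings against the expected
# findings from the "What do we expect?" section, using simple term overlap.
expected_findings = {"consolidation", "effusion", "fever", "crackles", "wbc"}

def extract_key_findings(summary: str) -> set:
    """Pull the text after 'Key Findings:' and split it into lowercase terms."""
    match = re.search(
        r"\*{0,2}Key Findings:\*{0,2}\s*(.+?)(?:\*{0,2}Recommendations:|$)",
        summary,
        flags=re.S | re.I,
    )
    if not match:
        return set()
    return set(re.findall(r"[a-z]+", match.group(1).lower()))

found = extract_key_findings(model_response)
coverage = len(expected_findings & found) / len(expected_findings)
print(f"Expected findings mentioned: {coverage:.0%}")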

Diagnosis Generation

LLM workflow for diagnosis generation.

Our goal is to provide a preliminary medical diagnosis of a patient based on their symptoms.

Here we’ll use a chain-of-thought prompting to guide the model through a systematic reasoning process.

We will define the following steps for the model to follow to arrive at the answer:

  • Step 1: List potential diagnoses that could explain these symptoms.
  • Step 2: For each potential diagnosis, evaluate the likelihood based on the symptom presentation.
  • Step 3: Identify what additional tests or information would be most valuable for differential diagnosis.
  • Step 4: Recommend a preliminary diagnosis and next steps.

Data provided to the model:

Patient’s symptoms:

  • Persistent cough for 3 weeks, after returning from a trip to a humid tropical country
  • Chest pain while breathing deeply
  • Low-grade fever (99.5°F)
  • Fatigue and decreased appetite

What do we expect?

The given symptoms are consistent with pneumonia (although it overlaps with other conditions). We would expect the model to consider pneumonia as a potential diagnosis and provide treatment recommendations consistent with pneumonia or other similar conditions. We’ll look at Step 2 to verify that pneumonia was considered in coming to a potential diagnosis.

Model Output

Code
# Chain-of-Thought Prompting example for medical diagnosis

symptoms = """
- Persistent cough for 3 weeks, after returning from a trip to a humid tropical country.
- Chest pain while breathing deeply
- Low-grade fever (99.5°F)
- Fatigue and decreased appetite
"""


cot_prompt = f"""
You are a medical expert. You are given the following patient's symptoms.
{symptoms}

Follow these steps to get to a preliminary diagnosis and recommendations:
Step 1: List potential diagnoses that could explain these symptoms.
Step 2: For each potential diagnosis, evaluate the likelihood based on the symptom presentation.
Step 3: Identify what additional tests or information would be most valuable for differential diagnosis.
Step 4: Recommend a preliminary diagnosis and next steps.

Your response should be in the following format:   
**Step 1:** [potential diagnoses]
**Step 2:** [likelihood for each diagnosis]
**Step 3:** [brief list of tests or information]
**Step 4:** [preliminary diagnosis]

**Diagnosis:** [diagnosis]
**Likelihood:** [likelihood]
**Next steps:** [next steps]
"""
print(cot_prompt)
Code
#Pass the CoT prompt to the model
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
response = model_response = invoke_model(model_id, cot_prompt)
Code
#Display the full prompt and model response
display_chat_bubble(cot_prompt, sender="user")
display_chat_bubble(model_response, sender="model")
👤User Query

You are a medical expert. You are given the following patient's symptoms.

  • Persistent cough for 3 weeks, after returning from a trip to a humid tropical country.
  • Chest pain while breathing deeply
  • Low-grade fever (99.5°F)
  • Fatigue and decreased appetite

Follow these steps to get to a preliminary diagnosis and recommendations: Step 1: List potential diagnoses that could explain these symptoms. Step 2: For each potential diagnosis, evaluate the likelihood based on the symptom presentation. Step 3: Identify what additional tests or information would be most valuable for differential diagnosis. Step 4: Recommend a preliminary diagnosis and next steps.

Your response should be in the following format:
Step 1: [potential diagnoses] Step 2: [likelihood for each diagnosis] Step 3: [brief list of tests or information] Step 4: [preliminary diagnosis]

Diagnosis: [diagnosis] Likelihood: [likelihood] Next steps: [next steps]

🩺 LLM Response

As a medical expert, I will follow the steps you provided to arrive at a preliminary diagnosis and recommendations for the given patient's symptoms.

Step 1: Potential Diagnoses - Pneumonia - Tuberculosis - Fungal lung infection (e.g., histoplasmosis, coccidioidomycosis) - Lung cancer - Pulmonary embolism

Step 2: Likelihood of Potential Diagnoses - Pneumonia: High likelihood, as the patient has a persistent cough, chest pain, and low-grade fever, which are common symptoms of pneumonia. - Tuberculosis: Moderate likelihood, as the patient has recently returned from a humid tropical country, which is a risk factor for tuberculosis. - Fungal lung infection: Moderate likelihood, as the patient has recently returned from a humid tropical country, which is a risk factor for certain fungal infections. - Lung cancer: Low likelihood, as the patient's symptoms are more indicative of an infectious or inflammatory process. - Pulmonary embolism: Low likelihood, as the patient does not have any known risk factors for this condition.

Step 3: Additional Tests or Information - Chest X-ray or CT scan to evaluate the lungs and identify any abnormalities - Sputum culture or acid-fast bacilli (AFB) test to check for bacterial or mycobacterial infections - Blood tests, including complete blood count (CBC) and inflammatory markers (e.g., C-reactive protein, erythrocyte sedimentation rate) - Travel history and potential exposures during the trip to the humid tropical country

Step 4: Preliminary Diagnosis and Next Steps Preliminary Diagnosis: Pneumonia Likelihood: High

Next Steps: 1. Order a chest X-ray or CT scan to assess the lungs and identify any signs of pneumonia or other pulmonary abnormalities. 2. Obtain sputum samples for culture and AFB testing to rule out bacterial or mycobacterial infections, such as tuberculosis. 3. Perform blood tests, including a CBC and inflammatory markers, to support the diagnosis of pneumonia and monitor the patient's response to treatment. 4. Gather more information about the patient's recent travel history and potential exposures during the trip to the humid tropical country

What do we see?

We see that the model was able to follow the steps we defined in order to get to a preliminary diagnosis and provide some next steps. Based on the symptoms, the model identified that the patient has a high likelihood of having pneumonia, as we expected.
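
Since the prompt asked for labeled steps, this can be spot-checked programmatically as well. A small illustrative sketch (assuming model_response holds the text above) confirms that pneumonia was weighed in Step 2:

import re

# A small illustrative check (not part of the original pipeline): confirm that pneumonia
# appears in the Step 2 likelihood assessment of the chain-of-thought response.
step_2 = re.search(r"Step 2:(.*?)(?:Step 3:|$)", model_response, flags=re.S | re.I)
step_2_text = step_2.group(1) if step_2 else ""

print("Pneumonia considered in Step 2:", "pneumonia" in step_2_text.lower())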

Clinical Classification

LLM workflow for clinical classification.

Our goal is to classify medical data into different categories using zero-shot prompting. Here we’ll pass the model a prompt with the category labels and the data to be classified.

Here’s an example of what we’ll be passing to the model:

medical_text_1 = “Patient presents with a fever of 103.5 degrees Fahrenheit, accompanied by severe chills, rigors, body aches, and a persistent throbbing headache. The patient reports feeling disoriented, has a flushed face, and exhibits petechiae on the chest and extremities. White blood cell count is elevated. Patient reports recent lower backpain on their left side and some blood in their urine.”

categories_1 = [“Infection”, “Heat Stroke”, “Drug Reaction”]

In this case the classification should be Infection based on the information provided.

What do we expect?

In each case we would expect that the model would recognize the classic symptoms presented in the clinical information. From there it should be able to link that information to one of the three classes provided, selecting the most likely class.

Model Output

Code
# Zero-shot learning example for medical classification

# Example 1: Fever with More Symptoms
medical_text_1 = """
Patient presents with a fever of 102.5 degrees Fahrenheit, accompanied by severe chills, body aches,
and a persistent headache. The patient reports feeling disoriented and has a flushed face.
"""
medical_text_1 =" ".join(medical_text_1.splitlines())
categories_1 = ["Infection", "Heat Stroke", "Drug Reaction"]


# Example 2: Broken Bone with Injury Details
medical_text_2 = """
Patient sustained a fall from a ladder, with immediate onset of severe pain and swelling in the lower leg. 
X-ray shows a complete, displaced fracture of the tibia, with evidence of soft tissue injury.
"""
medical_text_2 =" ".join(medical_text_2.splitlines())
categories_2 = ["Fracture", "Sprain", "Muscle Strain"]


# Example 3: Fever, Cough, Nasal Congestion Symptoms
medical_text_3 = """
Patient presents with a sudden onset of high fever (102-104°F), accompanied by severe chills, 
intense muscle aches (myalgia), and a persistent, dry cough. The patient reports a severe headache, 
significant fatigue, and a sore throat. Nasal congestion is minimal. 
The patient reports that many people in their workplace have been experiencing similar symptoms recently.
"""
medical_text_3 =" ".join(medical_text_3.splitlines())
categories_3 = ["Common Cold", "Flu", "Allergies"]


def zero_shot_classification_prompt(text, categories):
    # Combine the task instructions, category list, and text into a single prompt
    prompt = (
        f"Task: Classify the following medical cases into the correct category.\n"
        f"Choose the most appropriate category from the list provided: {', '.join(categories)}\n"
        f"\n"
        f"Text: {text}\n"
        f"\n"
        f"Response format:\n"
        f"**Category:** [selected category]\n"
        f"**Confidence:** [high/medium/low]\n"
        f"**Reasoning:** [brief explanation]"
    )
    return prompt

zero_shot_prompt = zero_shot_classification_prompt(text = medical_text_1, categories = categories_1)
print(zero_shot_prompt)

● Example 1

Code
#Pass the zero-shot prompt to the model
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
zero_shot_prompt = zero_shot_classification_prompt(text = medical_text_1, categories = categories_1)
response = model_response = invoke_model(model_id, zero_shot_prompt)
Code
#Display the full prompt and model response
display_chat_bubble(zero_shot_prompt, sender="user")
display_chat_bubble(model_response, sender="model")
👤User Query

Task: Classify the following medical cases into the correct category. Choose the most appropriate category from the list provided: Infection, Heat Stroke, Drug Reaction

Text: Patient presents with a fever of 102.5 degrees Fahrenheit, accompanied by severe chills, body aches, and a persistent headache. The patient reports feeling disoriented and has a flushed face.

Response format: Category: [selected category] Confidence: [high/medium/low] Reasoning: [brief explanation]

🩺 LLM Response

Category: Infection Confidence: High Reasoning: The patient's symptoms, including fever, chills, body aches, and headache, are typical of an infection. The disorientation and flushed face could also be indicative of an underlying infection, rather than heat stroke or a drug reaction.

● Example 2

Code
#Pass the zero-shot prompt to the model
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
zero_shot_prompt = zero_shot_classification_prompt(text = medical_text_2, categories = categories_2)
response = model_response = invoke_model(model_id, zero_shot_prompt)
Code
#Display the full prompt and model response
display_chat_bubble(zero_shot_prompt, sender="user")
display_chat_bubble(model_response, sender="model")
👤User Query

Task: Classify the following medical cases into the correct category. Choose the most appropriate category from the list provided: Fracture, Sprain, Muscle Strain

Text: Patient sustained a fall from a ladder, with immediate onset of severe pain and swelling in the lower leg. X-ray shows a complete, displaced fracture of the tibia, with evidence of soft tissue injury.

Response format: Category: [selected category] Confidence: [high/medium/low] Reasoning: [brief explanation]

🩺 LLM Response

Category: Fracture Confidence: High Reasoning: The given description indicates that the patient sustained a fall from a ladder, resulting in severe pain and swelling in the lower leg. The X-ray findings show a complete, displaced fracture of the tibia, which is a clear indication of a fracture. The presence of soft tissue injury further supports the diagnosis of a fracture.

● Example 3

Code
#Pass the zero-shot prompt to the model
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
zero_shot_prompt = zero_shot_classification_prompt(text = medical_text_3, categories = categories_3)
response = model_response = invoke_model(model_id, zero_shot_prompt)
Code
#Display the full prompt and model response
display_chat_bubble(zero_shot_prompt, sender="user")
display_chat_bubble(model_response, sender="model")
👤User Query

Task: Classify the following medical cases into the correct category. Choose the most appropriate category from the list provided: Common Cold, Flu, Allergies

Text: Patient presents with a sudden onset of high fever (102-104°F), accompanied by severe chills, intense muscle aches (myalgia), and a persistent, dry cough. The patient reports a severe headache, significant fatigue, and a sore throat. Nasal congestion is minimal. The patient reports that many people in their workplace have been experiencing similar symptoms recently.

Response format: Category: [selected category] Confidence: [high/medium/low] Reasoning: [brief explanation]

🩺 LLM Response

Category: Flu Confidence: High Reasoning: The patient's symptoms, including high fever, severe chills, intense muscle aches, persistent dry cough, headache, fatigue, and sore throat, are characteristic of the flu (influenza) rather than a common cold or allergies. The sudden onset of these symptoms and the report of similar cases in the patient's workplace further support the diagnosis of the flu.

What do we see?

As anticipated, the model accurately classified symptoms across all cases, demonstrating its ability to recognize clinical indications from diverse domains, including:

  • Infectious disease (Example 1)
  • Orthopedic trauma (Example 2)
  • Viral illness (Example 3)
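
Because the prompt builder and the invoke_model helper are already parameterized, the same classification can be run as a batch over all three cases. A small illustrative loop (not part of the original code):

# A small illustrative extension: run all three cases as a batch through the same
# zero-shot prompt builder and model.
cases = [
    (medical_text_1, categories_1),
    (medical_text_2, categories_2),
    (medical_text_3, categories_3),
]

model_id = "anthropic.claude-3-haiku-20240307-v1:0"
results = []
for text, categories in cases:
    prompt = zero_shot_classification_prompt(text=text, categories=categories)
    results.append(invoke_model(model_id, prompt))

for i, result in enumerate(results, start=1):
    print(f"--- Example {i} ---\n{result}\n")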

Structured Data Extraction

LLM workflow for structured data extraction.

Our goal is to use the LLM to extract specific information from medical data, guided by XML tags.

XML’s (Extensible Markup Language) structured format is key here, making it easier for the LLM to understand the data’s organization. We use tags like <instructions> and <example> to provide direct guidance and illustrative examples. We’ll use XML tags along with one-shot learning to show the LLMs how to extract and format clinical data to be returned as an output.

For example we can use XML tags to indicate where each part of the prompt begins and ends. The example below defines 3 main parts:

  1. The <instructions> or task that we want the model to perform.
  2. The input data that we want the model to use, in this case a <clinical_note>.
  3. The format that we want returned for the extracted data. Defined as <output_schema>.

<instructions>
Extract the patient’s name, age, and diagnosis from the following clinical note.
</instructions>

<clinical_note>
Patient: John Smith, 67 years old, presented with symptoms of acute myocardial infarction.
</clinical_note>

<output_schema>
<patient_name> </patient_name>
<patient_age> </patient_age>
<diagnosis> </diagnosis>
</output_schema>

What do we expect?

We expect the model to accurately follow the instructions given, effectively analyze the input data, and correctly extract and format the requested fields according to the formatting guide.

For example, for the given input above we would expect the model to return:

<output_schema>
<patient_name> John Smith </patient_name>
<patient_age> 67 </patient_age>
<diagnosis> Acute Myocardial Infarction </diagnosis>
</output_schema>
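
One advantage of tag-delimited output is that it can be parsed downstream with a standard XML parser. A minimal sketch, assuming the model returns the <output_schema> block exactly as above (the llm_xml_output string is a stand-in for a real response):

import xml.etree.ElementTree as ET

# A minimal parsing sketch (not part of the original pipeline); llm_xml_output stands in
# for an actual model response that follows the schema exactly.
llm_xml_output = """
<output_schema>
<patient_name> John Smith </patient_name>
<patient_age> 67 </patient_age>
<diagnosis> Acute Myocardial Infarction </diagnosis>
</output_schema>
"""

root = ET.fromstring(llm_xml_output.strip())
extracted = {child.tag: (child.text or "").strip() for child in root}
print(extracted)
# -> {'patient_name': 'John Smith', 'patient_age': '67', 'diagnosis': 'Acute Myocardial Infarction'}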

Model Output

Code
clinical_note = """
Todays date is March 12th, 2025.
Patient, Marty K. Martin, is a 62-year-old male with a history of hypertension, type 2 diabetes, and hyperlipidemia. 
Currently taking Lisinopril 20mg daily, Metformin 1000mg BID, and Atorvastatin 40mg at bedtime. 
Patient reports having a coronary artery bypass graft (CABG) 5 years ago.
Known allergies to penicillin (hives) and sulfa drugs (rash).
Vitals on admission: BP 142/88, HR 78, RR 16, Temp 98.6°F, SpO2 97% on room air.
Recent HbA1c was 7.2% down from 7.8%. Patient scheduled for routine colonoscopy next month. 
Patient celebrated his 62nd birthday on Tuesday, March 8th, 2025.
Medical record number: 87654321
"""
clinical_note =" ".join(clinical_note.splitlines())

xml_prompt = f"""
You're a medical records analyst at a Clinical Electronics Health Records company. 
Generate a comprehensive patient summary for our medical team.
Our hospital uses structured data formats for efficient information transfer between departments. 
The medical team needs clear, concise, and actionable patient information.
Use this clinical note for your summary:
<clinical_note>
{clinical_note}
</clinical_note>
<instructions>
1. Extract patient information including: Patient Name, MRN,  DOB (ONLY calculate DOB if enough information is given)
2. Extract and organize: Medications, Allergies, Vital Signs, and Recent Procedures
3. Highlight critical values and potential drug interactions
4. Format as a nested table structure for easy reference
</instructions>
Please include all relevant information and maintain the nested table structure in your summary.
Replace missing data with N/A.
Make your tone clinical and precise. 
Follow this structure exactly and only return the formatted output:
<formatting_example>
**PATIENT SUMMARY**
**Patient**: John Doe 
**MRN**: 12345678 
**DOB**: 01/15/1965
**Exam Date**: 06/12/2020
**MEDICATIONS**

| Medication | Dosage | Frequency | Route | Indication |
|------------|--------|-----------|-------|------------|
| Lisinopril | 20mg   | Daily     | PO    | HTN        |

**ALLERGIES**

| Allergen   | Reaction  | Severity  |
|------------|-----------|-----------|
| Penicillin | Urticaria | Moderate  |

**VITAL SIGNS** (Date: 03/15/2025)

| Parameter       | Value      | Status    |
|-----------------|------------|-----------|
| BP              | 132/78     | WNL       |
| Heart Rate      | 88 bpm     | WNL       |
| Temperature     | 99.1°F     | ELEVATED  |
| Oxygen Sat      | 97%        | WNL       |

**LAB VALUES**

| Test  | Value | Units | Date      |
|-------|-------|-------|-----------|
|Glucose| 70    | mg/dL | 12/12/2024|

**RECENT & SCHEDULED PROCEDURES**

| Procedure     | Date       | Provider | Key Findings |
|---------------|------------|----------|--------------|
| Heart Surgery | Next month | Dr. Smith| N/A          |

</formatting_example>
"""
Code
#Pass the XML-tagged prompt to the model
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
response = model_response = invoke_model(model_id, xml_prompt)
Code
#Display the full prompt and model response
display_chat_bubble(xml_prompt, sender="user")
display_chat_bubble(model_response, sender="model")
👤User Query

You're a medical records analyst at a Clinical Electronics Health Records company. Generate a comprehensive patient summary for our medical team. Our hospital uses structured data formats for efficient information transfer between departments. The medical team needs clear, concise, and actionable patient information. Use this clinical note for your summary: Todays date is March 12th, 2025. Patient, Marty K. Martin, is a 62-year-old male with a history of hypertension, type 2 diabetes, and hyperlipidemia. Currently taking Lisinopril 20mg daily, Metformin 1000mg BID, and Atorvastatin 40mg at bedtime. Patient reports having a coronary artery bypass graft (CABG) 5 years ago. Known allergies to penicillin (hives) and sulfa drugs (rash). Vitals on admission: BP 142/88, HR 78, RR 16, Temp 98.6°F, SpO2 97% on room air. Recent HbA1c was 7.2% down from 7.8%. Patient scheduled for routine colonoscopy next month. Patient celebrated his 62nd birthday on Tuesday, March 8th, 2025. Medical record number: 87654321 1. Extract patient information including: Patient Name, MRN, DOB (ONLY calculate DOB if enough information is given) 2. Extract and organize: Medications, Allergies, Vital Signs, and Recent Procedures 3. Highlight critical values and potential drug interactions 4. Format as a nested table structure for easy reference Please include all relevant information and maintain the nested table structure in your summary. Replace missing data with N/A. Make your tone clinical and precise. Follow this structure exactly and only return the formatted output: PATIENT SUMMARY Patient: John Doe MRN: 12345678 DOB: 01/15/1965 Exam Date: 06/12/2020 MEDICATIONS

Medication Dosage Frequency Route Indication
Lisinopril 20mg Daily PO HTN

ALLERGIES

Allergen Reaction Severity
Penicillin Urticaria Moderate

VITAL SIGNS (Date: 03/15/2025)

Parameter Value Status
BP 132/78 WNL
Heart Rate 88 bpm WNL
Temperature 99.1°F ELEVATED
Oxygen Sat 97% WNL

LAB VALUES

Test Value Units Date
Glucose 70 mg/dL 12/12/2024

RECENT & SCHEDULED PROCEDURES

Procedure Date Provider Key Findings
Heart Surgery Next month Dr. Smith N/A

🩺 LLM Response

PATIENT SUMMARY
Patient: Marty K. Martin
MRN: 87654321
DOB: 03/08/1963
Exam Date: 03/12/2025

MEDICATIONS

| Medication   | Dosage | Frequency | Route | Indication     |
|--------------|--------|-----------|-------|----------------|
| Lisinopril   | 20mg   | Daily     | PO    | Hypertension   |
| Metformin    | 1000mg | BID       | PO    | Diabetes       |
| Atorvastatin | 40mg   | Bedtime   | PO    | Hyperlipidemia |

ALLERGIES

| Allergen    | Reaction | Severity |
|-------------|----------|----------|
| Penicillin  | Hives    | Moderate |
| Sulfa Drugs | Rash     | Moderate |

VITAL SIGNS (Date: 03/12/2025)

| Parameter         | Value  | Status   |
|-------------------|--------|----------|
| Blood Pressure    | 142/88 | ELEVATED |
| Heart Rate        | 78 bpm | WNL      |
| Temperature       | 98.6°F | WNL      |
| Oxygen Saturation | 97%    | WNL      |
| Respiratory Rate  | 16 bpm | WNL      |

LAB VALUES

| Test  | Value | Units | Date       |
|-------|-------|-------|------------|
| HbA1c | 7.2   | %     | 03/12/2025 |

RECENT & SCHEDULED PROCEDURES

| Procedure                           | Date        | Provider | Key Findings |
|-------------------------------------|-------------|----------|--------------|
| Coronary Artery Bypass Graft (CABG) | 5 years ago | N/A      | N/A          |
| Colonoscopy                         | Next month  | N/A      | N/A          |

What do we see?

We see that the model was able to successfully extract all the relevant information from the provided text (after several rounds of refining the input prompt) and then followed the formatting example to correctly format the output data.

There are a few interesting things to note in the model’s output:

  • Although the term MRN does not explicitly appear in the text, the model identified that MRN stood for Medical Record Number and pulled out the listed number.
  • The patient’s DOB was not explicitly mentioned, but the model used context clues to infer it, such as:
    • Patient, Marty K. Martin, is a 62-year-old
    • Patient celebrated his 62nd birthday on Tuesday, March 8th, 2025
  • The model was able to correctly match each of the medications to their indications:
    • Lisinopril –> Hypertension
    • Metformin –> Diabetes
    • Atorvastatin –> Hyperlipidemia
  • Within the vital signs table, the model correctly identified that the blood pressure reported was ELEVATED (typical 120/80)
  • The patient’s HbA1c was correctly extracted and the units were correctly identified
  • Both the past and upcoming procedures for the patient were correctly identified and extracted with an informative date assigned.

Putting it all together

Now that we have all the components that we need we can encapsulate it into a helpful UI (User Interface) and deploy it to a cloud provider like AWS EC2, GCP, Azure, or Hugging Face (HF).

Since Hugging Face is quick and easy to set up, we can create and use a Hugging Face Space to host our app. HF also hosts a number of open-source LLMs so we can switch between Bedrock models and HF in our app. We’ll need:

  • UI code like Streamlit with connections to make API calls to the models (like we did above using boto3). The model outputs are passed to Streamlit to present them in a nicer format.
  • From there we can put our Streamlit file into our Hugging Face Space.
  • Then run our app, which will create an interface to interact with our models.

For example, this is a simple Streamlit app using AWS Bedrock saved as app.py:

Code
  • Import libraries

    #|||||||||||||||||||||||||||||||||||||||||||||||||
    #                  IMPORTS
    import streamlit as st
    import boto3
    import json
    import os
    from dotenv import load_dotenv, dotenv_values
    load_dotenv()
  • Set up prompt

    #|||||||||||||||||||||||||||||||||||||||||||||||||
    #                   PROMPT
    
    #==================================================
    #       Setting up Prompt
    #%% Clinical Narrative Summarization
    
    # Define the examples and query
    cns_example_1 = """\
    **INPUT:**
    MRI of the right knee demonstrates a complex tear of the posterior horn and body of the medial meniscus. 
    There is moderate joint effusion and mild chondromalacia of the patellofemoral compartment. 
    The anterior and posterior cruciate ligaments are intact.
    \n**OUTPUT:**
    \n**Key Findings:** Complex tear of medial meniscus (posterior horn and body); Moderate joint effusion; Mild patellofemoral chondromalacia
    \n**Recommendations:** Orthopedic consultation; Consider arthroscopic evaluation and treatment
    """
    cns_example_2 = """\
    **INPUT:**
    Chest X-ray shows clear lung fields bilaterally with no evidence of consolidation, effusion, or pneumothorax. 
    The heart size is within normal limits. 
    No acute cardiopulmonary process identified.
    \n**OUTPUT:**
    \n**Key Findings:** Clear lung fields; Normal heart size; No acute cardiopulmonary process
    \n**Recommendations:** No further imaging needed at this time
    """
    
    cns_query = """\
    Patient presents with a 3-day history of high fever (103°F), chills, productive cough with yellow sputum, and pleuritic chest pain exacerbated by deep breathing. 
    Physical exam reveals decreased breath sounds and crackles in the right lower lung field. 
    ECG shows sinus tachycardia, but is otherwise unremarkable. 
    Chest X-ray demonstrates consolidation in the right lower lobe, and a small right-sided pleural effusion. White blood cell count is elevated.
    """
    cns_query =" ".join(cns_query.splitlines())
    
    # Combine the examples and query into a single prompt
    # We'll use f-strings to insert the content from the examples and query into the full prompt.
    cns_prompt = f"""\
    **Task:** Summarize the following medical report into key findings and recommendations.
    Follow  the format of the examples strictly. Make the summary concise and actionable.
    The key findings should be short and to the point.
    The key findings should be less than 5 words.
    The recommendations should be clear and specific.
    The summary should be in the following format:
    \n**Key Findings:** [colon-separated list of key findings]
    \n**Recommendations:** [colon-separated list of recommendations]
    {cns_example_1}
    {cns_example_2}
    Now summarize this medical report:
    **INPUT:**
    """
    #..................................................
    
    
    #==================================================
    # Task Prompts
    tasks = {
        "Clinical Narrative Summarization": {
            "prompt": cns_prompt,
            "example_query": cns_query,
        },}
    
    def generate_prompts(task_name, query):
        if task_name == "Clinical Narrative Summarization":
            prompt = f"""\
    **Task:** Summarize the following medical report into key findings and recommendations.
    Follow  the format of the examples strictly. Make the summary concise and actionable.
    The key findings should be short and to the point.
    The key findings should be less than 5 words.
    The recommendations should be clear and specific.
    The summary should be in the following format:
    \n**Key Findings:** [colon-separated list of key findings]
    \n**Recommendations:** [colon-separated list of recommendations]
    {cns_example_1}
    {cns_example_2}
    Now summarize this medical report:
    **INPUT:**
    {query}
    """
            return prompt, query
    
    #_____________________________________________________
  • Set up AWS Bedrock

    #|||||||||||||||||||||||||||||||||||||||||||||||||
    #                   AWS Bedrock
    
    #==================================================
    #       Define AWS Helper function
    
    def invoke_model(model_id, user_message, conversation=None, inference_config=None, verbose=False):
        # # Set the model ID, e.g., Claude 3 Haiku.
        # model_id = "anthropic.claude-3-haiku-20240307-v1:0"
    
        # # Start a conversation with the user message.
        # user_message = "Describe the python language."
    
        if conversation is None:
            conversation = [
                {
                    "role": "user",
                    "content": [{"text": user_message}],
                }
            ]
    
        # Set the inference configuration.
        if inference_config is None:
            inference_config = {
                "maxTokens": 512,
                "temperature": 0.5,
                "topP": 0.9,
            }
    
        try:
            # Send the message to the model, using a basic inference configuration.
            response =  client.converse(
                modelId=model_id,
                messages=conversation,
                inferenceConfig=inference_config
            )
    
            # Extract and print the response text.
            response_text = response["output"]["message"]["content"][0]["text"]
            if verbose:
                print(f"Request: {user_message}")
                print(f"Response: {response_text}")
    
            return response_text
    
        except (Exception) as e:
            print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
            exit(1)
    #..................................................
    
    
    #==================================================
    # Configure AWS credentials
    client = boto3.client(
        service_name='bedrock-runtime',
        region_name=os.environ['AWS_DEFAULT_REGION'], #Region where the Bedrock service is running e.g. us-east-1
        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'], #AWS Access Key ID
        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'] #AWS Secret Access Key
    )
    
    
    # Define the available models
    model_links = {"Claude-3-Haiku": "anthropic.claude-3-haiku-20240307-v1:0"}
    models = list(model_links.keys())
    
    
    #_____________________________________________________
  • Set up Streamlit

    #|||||||||||||||||||||||||||||||||||||||||||||||||
    #                   Streamlit
    
    # Streamlit App
    st.title("LLM Clinical Data Analysis")
    
    
    model_name = st.sidebar.selectbox("Select Model", models)
    task_name = st.selectbox("Select Task", list(tasks.keys()))
    query = st.text_area("Enter Query", value=tasks[task_name]["example_query"])
    
    model_url = model_links[model_name]
    
    
    with st.expander("Updated Prompt"):
        input_text, query = generate_prompts(task_name, query)
        st.write(str(input_text))
    
    if st.button("Analyze"):    
        result = invoke_model(model_url, input_text)
        st.write("Analysis Result:", result)

You can experiment with the live app here.
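
To extend the app with another task, the pattern is to register it in the tasks dictionary and add a matching branch to generate_prompts; the Streamlit selectbox then picks it up automatically. A hedged sketch with a hypothetical task name and example text (not from the original app):

# An illustrative sketch of wiring a second task into the same UI: register it in tasks
# and handle it in generate_prompts. Task name and example text are hypothetical.
classification_example = (
    "Patient presents with a fever of 102.5 degrees Fahrenheit, accompanied by severe chills, "
    "body aches, and a persistent headache."
)

tasks["Clinical Classification"] = {
    "prompt": "",                          # built per query in generate_prompts
    "example_query": classification_example,
}

def generate_prompts(task_name, query):
    if task_name == "Clinical Narrative Summarization":
        return cns_prompt + query, query   # reuse the few-shot summarization prompt defined above
    if task_name == "Clinical Classification":
        categories = ["Infection", "Heat Stroke", "Drug Reaction"]
        prompt = (
            "Task: Classify the following medical case into the correct category.\n"
            f"Choose the most appropriate category from: {', '.join(categories)}\n\n"
            f"Text: {query}\n"
        )
        return prompt, query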

Extra: See how to create an LLM Streamlit UI here (Gebodh, 2024a).

LLM clinical data analysis app switching from AWS Bedrock to HF hosted models and using Google’s Gemma 3 - 27B.

Take Away

We showcased the power of LLMs for clinical data analysis using AWS Bedrock, demonstrating efficient data extraction, summarization, and interpretation.

We explored:

  • Clinical narrative summarization
  • Diagnosis generation
  • Medical data classification
  • Structured data extraction

These examples highlighted how effective prompt engineering enables LLMs to perform complex medical tasks. However, these outputs are meant to:

  • Assist, not replace, medical professionals.
  • Be validated by clinical experts, adhering to all relevant regulations (HIPAA, GDPR, etc.).

When properly guided, LLMs can streamline workflows and improve data management, potentially leading to better patient outcomes.


Cite as

@misc{Gebodh2025BuildingLLMClinicalDataAnalyticsPipelineswithAWSBedrock,
  title = {Building LLM Clinical Data Analytics Pipelines with AWS Bedrock},
  author = {Nigel Gebodh},
  year = {2025},
  url = {https://ngebodh.github.io/projects/Short_dive_posts/aws/aws-analysis.html},
  note = {Published: April 11, 2025}
}

References

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language models are few-shot learners. https://arxiv.org/abs/2005.14165
Gebodh, N. (2024a). Large language models - chatting with AI chatbots from google, mistral AI, and hugging face. https://ngebodh.github.io/projects/2024-03-05/
Gebodh, N. (2024b). Why does my LLM have a temperature? https://ngebodh.github.io/projects/Short_dive_posts/LLM_temp/LLM_temp.html
Qiao, S., Ou, Y., Zhang, N., Chen, X., Yao, Y., Deng, S., Tan, C., Huang, F., & Chen, H. (2023). Reasoning with language model prompting: A survey. https://arxiv.org/abs/2212.09597
Schulhoff, S., Ilie, M., Balepur, N., Kahadze, K., Liu, A., Si, C., Li, Y., Gupta, A., Han, H., Schulhoff, S., Dulepet, P. S., Vidyadhara, S., Ki, D., Agrawal, S., Pham, C., Kroiz, G., Li, F., Tao, H., Srivastava, A., … Resnik, P. (2025). The prompt report: A systematic survey of prompt engineering techniques. https://arxiv.org/abs/2406.06608
