Pages

Tuesday, February 25, 2025

Creating AI Agents with Small Language Models

Metallic blue humanoid robot acting as a secret agent

It is an understatement to say that generative artificial intelligence applications have evolved rapidly since the launch of ChatGPT over two years ago. What started as applications that could answer questions via a text-based conversation has evolved into rich applications that can generate multimedia content, i.e., images, video and audio, by interpreting user requirements via natural language.

Now, the next evolution of these applications is called "AI agents," which mimic the human ability to reason and contemplate requests, gather external information and consider known facts before responding. These agents represent the next frontier by transforming static, single-step systems into dynamic, multi-step, autonomous agents capable of reasoning and gathering information external to themselves.

How AI agents reason and act

Agentic AI reasoning (thinking) process.  Credit: HuggingFace
The Agentic AI thought (reasoning) process. Credit: HuggingFace

AI agents' artificial reasoning process is described as the "Think, Act and Observe" cycle.

  1. Think - the AI agent invokes a language model to consider the user request in the context of what it has learned from the current conversation with the end user. It decomposes the user request based on its current knowledge and resources to decide how to act on that request.
  2. Act - After the language model responds to the agent on the results of its analysis, the agent acts on the model's analysis. The agent's action can simply be responding to the end user, invoking an external tool on the language model's advice, and putting the tool's response into the conversation context.
  3. Observe - The agent returns the current conversation context to the language model for consideration. Then, the language model can decide to go through another reasoning cycle if it chooses to do so. It can also determine whether it can meet the end user's request with the information it already has in the conversation. If this happens, the agent stops the reasoning cycle and responds to the end user.

This iterative process allows the AI to refine its understanding and provide more accurate and relevant responses, enhancing user satisfaction. By continuously observing and analyzing, the agent can adapt to various user needs effectively.

Creating an AI agent on a laptop

There are various ways to create an AI agent. You will likely use the environments provided by the large hyperscalar providers (Google, Amazon, and Microsoft) to build production agentic AI applications for enterprise-based solutions. These ecosystems have a rich set of functions to support meeting enterprise application requirements.

But what if you want to learn how to create agents and observe their artificial reasoning process without the pre-requisite of learning a particular hyperscalar ecosystem?

Recent small language models from Meta Llama 3.2 and Mistral v0.3 support agentic AI reasoning and tool invocation. These models are small enough to start with the Ollama inference server running locally on a laptop. In my case, I have a modest Windows 11 laptop running an i3 Intel CPU with 8 GB of RAM that was sufficient for me to write and test my AI agent with Meta's Llama 3.1 "1b" model.

Straightforward AI agent implementation with a small language model, using a command line interface

If you are interested in the code, use this link to access my GitHub repository.

What I Learned from Creating an AI Agent

AI Agents Running on Small Devices Can Produce Valuable Outputs

Depending on requirements, it is possible to deploy lightweight AI agents with small language models to understand end-user requests and then return responses in chat conversations using natural languages.

Of course, there are inherent performance and quality trade-offs between running small language models on local hardware versus large language models hosted by hyperscalars. Deploying AI agents that utilize large language models on higher capacity hardware can yield richer responses but incur higher costs toward the hyperscalars providing those models.

AI Agents Running on Local Machines Ensure Data Privacy

Another trade-off between running agents that use locally versus remotely hosted language models is the consideration of data privacy. As with non-agentic AI applications, using open-source language models running on local compute resources removes the risk of sending user information to hyperscalars' remote data centers, thus ensuring a higher level of privacy in the chat conversation between the end users and AI applications.

Prompt Engineering Is Important to Meet Requirements

The language models need guidance to understand what external tools are available to them when they should be invoked, what information should be supplied to the tools and what information it can receive back from them. The AI agent can send a specially crafted system prompt to the language model when the inference server starts it to provide that level of guidance.

As with non-agentic AI applications, crafting detailed, specific prompts to guide AI agents and their language models is critical for meeting end-user expectations.

Conclusion

With the right technical skills and modest hardware, you can create functional AI agents that can analyze complex queries, fetch external data, and synthesize responses in your native language. This democratization of AI empowers you to experiment without relying solely on hyperscalar infrastructures. As we push the boundaries of what AI can achieve, building local, agentic applications unlocks new avenues for creativity, ensures data privacy and achieves resource efficiency.


 About Chris Vitalos

I leverage decades of expertise in the wireless telecommunications industry to provide advisory services to ThreatSciences.com, a consultant agency providing cybersecurity services and leading security advisors.

Outside work, I enjoy hiking, writing, and spending time with my family.


Sunday, January 19, 2025

Are your encrypted AWS S3 buckets secure from threat actors?

A humanoid robot with a silver and blue metallic exterior stands in front of a large, industrial safe. The robot has glowing blue eyes and detailed mechanical features. It is positioned to the left of the image, facing the safe. The safe has a dark gray metallic finish and a combination lock with visible dials and buttons. The background is a dark, blurred gray color.

You take proactive steps to secure your valuable business data by encrypting it at rest using private keys you manage and keep separate and off network. Yet the cybersecurity landscape constantly evolves, with threat actors developing new techniques to compromise systems, hold your data hostage, extort money from you and potentially put you out of business.

The Threat to Encrypted Cloud Storage

Until recently ransomware thieves encrypted local storage volumes or mounted file shares. Now hackers have discovered a novel threat to encrypted cloud storage. A new report from cyber resilience firm Halcyon identified a new campaign targeting Amazon Web Service’s (AWS) S3 cloud storage encrypted by Amazon’s Server-Side Encryption with Customer-Provided Keys (SSE-C).

This attack poses a significant recovery challenge as the attacker generates another set of SSE-C encryption keys. Because AWS doesn't store these customer-provided keys, recovering the data without the attacker's cooperation becomes virtually impossible.

How the Attack Works

  1. Credential Compromise: The attacker first gains valid AWS credentials with permissions to encrypt S3 buckets using SSE-C. This could be achieved through phishing scams, compromised systems, or exposed keys in code repositories.
  2. Access to Your Environment: The attacker leverages the stolen credentials to access your enterprise networks to discover the location of the existing S3 encryption keys.
  3. Unauthorized Key Rotation: The attacker decrypts the storage with your keys which they found on your network, then encrypt the storage with the keys they generated.
  4. Ransom Demand: A ransom note is left within the affected buckets, outlining the payment demands and threatening data deletion if the ransom is unpaid or the data is tampered with.
  5. Data Deletion Threat: To pressure victims, attackers may configure a lifecycle policy to automatically delete the encrypted data after a set timeframe, typically seven days.

Risk Assessment: Potential Impacts on Your Businesses

  • Data Loss: If the ransom is not paid, the encrypted data becomes permanently inaccessible.
  • Business Disruption: Critical data loss can cripple business operations, impacting productivity, customer service, and revenue.
  • Reputational Damage: Ransomware attacks can severely damage a company's reputation and erode customer trust.
  • Financial Losses: Businesses may incur financial losses from ransom payments, incident response, data recovery attempts, and legal/regulatory issues.
  • Operational Disruption: Loss of access to critical data can halt operations, leading to delays, missed deadlines, and contractual breaches.

Mitigation Strategies: Protecting Your S3 Buckets

  • Enforce Least Privilege: Grant AWS users and roles only the minimum permissions required for their tasks. Regularly review and audit IAM policies to ensure adherence to least privilege.
  • Employee Training: Train employees to identify and avoid phishing attempts and other social engineering tactics used to steal credentials, and best in class information security procedures on encryption key management.
  • Enable Multi-Factor Authentication (MFA): Enforce MFA for all AWS users, especially those with administrative privileges, to add an extra layer of security against compromised credentials.
  • Data Replication: Implement cross-region replication or backups to a separate AWS account for added data protection.
  • Use Short-Term Credentials: Avoid using long-term access keys. Instead, leverage IAM roles and AWS Security Token Service (STS) to generate temporary credentials for applications and services.
  • Restrict SSE-C Usage: If your applications don't require SSE-C, consider blocking its use through bucket policies or resource control policies (RCPs) within AWS Organizations.
  • Rotate Credentials Regularly: If using SSE-C, implement a regular rotation schedule for access keys and other credentials.
  • Monitor AWS Resources: Implement robust monitoring and logging using AWS CloudTrail and S3 server access logs. Set up alerts for suspicious activity, such as unusual API calls or bulk encryption operations.

How ThreatSciences.com Can Help

At ThreatSciences.com, we specialize in assessing and mitigating security risks. Our team of experts offers:

  • Comprehensive risk assessments of your on-premise data centers and cloud tenants.
  • Implementation of best-in-class security techniques and processes to protect your valuable business data.
  • Training your staff to be aware of phishing attacks and to sensitize them to information security best practices.

Ransomware attacks are devastating to your business. You do not want to be held hostage by bad actors.

Partner with ThreatSciences.com today to secure your future.


Thursday, January 2, 2025

How AI-Enabled SIEM Can Help Your SOC Staff

A circuit board with intricate patterns features a central microchip displaying an eye-like emblem, set against a backdrop of blue-lit interconnected lines. The circuit board's smooth texture contrasts with the reflective surface of the emblem. The image captures a technological scene with a top-down view, focusing on the microchip as the main subject. The color scheme is predominantly blue, creating depth through varying shades. Partner with ThreatSciences.com today for leading MSSP and fractional CISO services to secure your organization's future.

Sophisticated adversaries continually exploit gaps in traditional cybersecurity defenses. Legacy SIEMs, reliant on static, rule-based detection, leave teams overwhelmed by false alarms, often missing genuine incidents.

AI-Enabled SIEM Solutions Rise to the Challenge

Modern enterprises need more than legacy SIEM capabilities. AI-enabled SIEM solutions integrate artificial intelligence with traditional rules-based logic to enhance detection, automate responses, and optimize security operations. When deployed and managed by skilled experts, these systems revolutionize cybersecurity resilience.

Staying Ahead of Evolving Threats

While legacy SIEMs handle known threats, they falter against sophisticated attacks like zero-day exploits and Advanced Persistent Threats (APTs). AI-enabled SIEMs use machine learning and behavioral analytics to:

  1. Detect anomalies beyond predefined signatures.
  2. Continuously learn from data to adapt to new threats.
  3. Correlate low-frequency events to uncover complex, orchestrated APT attacks.

By aggregating disparate events into identifiable patterns, AI-enabled SIEMs expose hidden adversaries.

Leveraging Enterprise Data for Better Insights

AI-enabled SIEMs excel at analyzing enterprise-specific data, including historical incidents, network activity, and user behavior. This allows them to:

  1. Learn typical data interactions and traffic patterns.
  2. Detect precise outliers, reducing false positives and missed alerts.
  3. Provide actionable insights for faster, more effective responses.

Integrating External Threat Intelligence

To extend their reach, AI-enabled SIEMs incorporate curated external threat intelligence feeds. This proactive approach correlates emerging global threats with enterprise-specific data, strengthening threat mitigation and prevention.

Reducing Staff Burnout Through Automation

Alert fatigue undermines security teams. AI-enabled SIEMs mitigate this by:

  1. Using machine learning to prioritize high-risk, relevant alerts.
  2. Automating routine threat detection tasks.
  3. Allowing teams to focus on genuine risks, improving efficiency and morale.

Measuring Success: Key Performance Indicators

Quantifying the impact of AI-enabled SIEMs involves tracking critical KPIs:

  1. True Positives (TP): Detecting actual threats improves by 15–25%, thanks to advanced machine learning models capable of identifying multifaceted attacks.
  2. False Negatives (FN): Missed incidents decrease by 10–20% as AI learns from historical data and detects anomalies overlooked by legacy systems.
  3. False Positives (FP) : Resource-draining false alerts drop by 20–40% through contextual analysis and refined detection algorithms.

Partnering for Success

Deploying and maintaining AI-enabled SIEMs requires specialized expertise. Managed Security Service Providers (MSSPs) like ThreatSciences.com offer critical support to:

  1. Select the ideal SIEM to meet strategic business and technical needs.  ThreatSciences.com suggests Rapid7 Insight IDR SIEM.
  2. Seamlessly integrate the SIEM into your network or enhance existing systems.
  3. Provide ongoing support to maximize ROI and adapt as your network evolves.

ThreatSciences.com also delivers:

  1. Security Analysts: For incident validation and alert management.
  2. Infrastructure Engineers: To ensure system performance and reliability.
  3. Project Management: Tailored to cybersecurity domain requirements.

Transform Your Security Strategy

AI-enabled SIEMs empower enterprises to stay ahead of cyber adversaries. With ThreatSciences.com’s expertise, your organization gains unparalleled visibility into emerging threats and the tools to build a resilient security posture.

Partner with ThreatSciences.com today to secure your future.


I provide technical advisory services to ThreatSciences.com, leveraging decades of expertise in wireless telecommunications industry.

Outside work, I enjoy hiking, writing, and spending time with my family.

Sunday, December 15, 2024

Generative AI Aiding Accessibility, Quickly

A small, white robot stands beside a wooden easel in a sunlit forest. The robot has a backpack and is holding a paintbrush in its right hand. It appears to be observing the scene in front of it, which includes a small stream with a waterfall and a mossy bank. The background is a lush, green forest with tall trees and dappled sunlight.

When I write blog and social media posts on Bluesky and Mastodon, I want a tool to analyze a supplied image and generate concise and accurate descriptions, called "alt-text," to benefit visually impaired readers and help me streamline the publishing effort.  Thankfully, generative AI tools have expanded their ability to process and interpret images, making them ideal for this use case. 

Not All LLMs Are Created Equal

The number of available LLMs has surged, each with distinct features and capabilities. Their performance, however, can vary significantly, even when you use the same prompt and image. These differences are often attributed to the “black box” nature of LLMs, i.e. differences between their internal structures and how they are trained.

Choices, Choices. Which One Should I Use?

I processed a set of test images with prompts across multiple models, then evaluated the generated alt text for accuracy and presentation. This subjective critique helped me choose the best model to meet my needs. You also learn which model is best in class regarding image processing and recognition.

Evaluating LLMs for Alt Text Generation

I selected four LLMs capable of image processing and recognition:

  1. OpenAI ChatGPT-4.0
  2. Cohere Command R Plus 08/2024
  3. Anthropic Claude 3.5
  4. Google Gemini 1.5

All models were tested in their free versions without subscriptions.

Prompt Engineering for Alt Text Output

Prompt engineering techniques ensured consistent output over multiple test runs and meaningful comparisons between the models. While I crafted a universal prompt for ChatGPT, Claude, and Gemini, I designed a distinct prompt for the Hugging Chat hosted Cohere model due to its reliance on an external tool, CogVLM, for image processing. 

Example Prompt for Cohere:


        You are a skilled social media manager and blog author. Use the CogVLMv1
        Image Captioner tool to analyze the uploaded image. Rewrite its output
        into a concise and descriptive alt text for visually impaired readers.
        Provide factual image description. Include objects, background,
        interactions, gestures, poses, visible text, frequency. Describe colors,
        contrasts, textures, materials, composition, focus points, camera angle,
        perspective, context, lighting, shadows. Avoid subjective
        interpretations, speculation. Exclude introductory text, comments and
        administrative details. Write in English using its active voice, limited
        to five sentences. If the tool cannot be invoked, state: “Image
        Captioner tool not invoked, please try again in a new chat.” If the tool
        returns an error, relay the error message verbatim for troubleshooting.
      

Example Prompt for ChatGPT, Claude, and Gemini:


        You are a skilled social media manager and blog author. Analyze the
        supplied image to create concise and descriptive alt text for visually
        impaired readers. Provide factual image description. Include objects,
        background, interactions, gestures, poses, visible text, frequency.
        Describe colors, contrasts, textures, materials, composition, focus
        points, camera angle, perspective, context, lighting, shadows. Avoid
        subjective interpretations, speculation. Exclude meta-text, comments, or
        administrative details. Write in English using its active voice, limited
        to five sentences. If image processing fails, state: “I could not decode
        an image. Please try again.”
      

How Did They Compare?

  1. ChatGPT 4.0 - ChatGPT excelled in accuracy and descriptive quality, delivering alt text that was clear, concise, and contextually appropriate. It consistently met my expectations and outperformed others in execution speed, making it the top performer.

  2. Cohere Command R Plus - Cohere demonstrated strong performance, producing alt text comparable to ChatGPT. However, its reliance on the external CogVLM tool added complexity, and its execution speed was noticeably slower than ChatGPT.

  3. Anthropic Claude 3.5 - Claude’s output was solid but fell short of the top two models. Its alt text tended to adopt a third-person tone, such as “The image depicts people doing stuff and enjoying themselves,” which felt less natural compared to ChatGPT’s more direct descriptions like “People doing stuff and enjoying themselves.”

  4. Google Gemini 1.5 - Gemini ranked last. While it handled some images well, I noticed it occasionally hallucinated—generating descriptions that didn’t match the image. Additionally, this model refused to process images containing people, a significant limitation for creating alt text for diverse content.

ChatGPT Leads the Pack

From a qualitative standpoint, ChatGPT-4.0 and Cohere emerged as the image processing and recognition frontrunners. However, ChatGPT’s faster processing speed and ease of use gave it the edge in overall performance.

Generative AI makes social media and blog content more inclusive by helping improve accessibility for visually impaired readers while decreasing the publishing effort for creators.

Sunday, June 30, 2024

Prompt Engineering a Cover Letter

Tech Writer

Using Large Language Models (LLMs) can accelerate the generation of targeted documentation, ensuring it is free from spelling and grammar errors. These tools boost productivity, allowing you to focus on more valuable tasks.

Cover letters often require significant time and consideration. They must be error-free and tailored to a specific reader, showcasing how your experience and skills add value. This process requires time, effort, and skill, highlighting the value of LLMs in streamlining it.

In this blog post, I will present the technique I use to produce custom cover letters based on specific job descriptions and resumes.

Are cover letters worth the effort?

There are differing opinions on the merits of cover letters. I choose to submit them when given the opportunity. Cover letters allow you to highlight and elaborate on areas in your background that pertain to the job description, going beyond the resume's bullet points.

Using LLMs to produce targeted cover letters reduces effort while providing documents that are free from spelling and grammar errors. These letters are concise, professional in tone, and highlight relevant areas in your background required for the role.

Prompt engineering a cover letter

In a prior blog post, I explored the concept of prompt engineering to increase the quality and reliability of generative AI applications and content produced by LLM. Here is the prompt I engineered for this task:

You are a knowledgeable assistant well-versed in the English language. Write a cover letter for a job application. What follows are two distinct sections. The first section is the formal job description outlining expectations for the position. The second section is my resume. Compare the job description in the first section against my resume in the second, and generate a custom cover letter. The wording of the cover letter should showcase, emphasize, and elevate my relevant experience. Utilize keywords from the job description section to generate the cover letter. Ensure the cover letter addresses the mandatory job requirements. When generating the cover letter, use concise English language in the active voice and maintain a professional tone. The name of the hiring manager is unknown. When you emit your output, emit it using only plain text and contained within a code block.
Section One: Job Description:

Section Two: Resume:

I recommend adding additional context within the above prompt, instructing the LLM to emphasize key points in your background that are relevant for the position, and de-emphasize those areas that are not relevant. If you choose to do this, I recommend inserting this additional context after the sentence, "The wording of the cover letter should showcase, emphasize, and elevate my relevant experience."

Cut and paste the prompt into your preferred LLM's chat interface, and insert the job description in section one, your resume in section two, and press submit.

The First Draft: Raw AI Output

Now the LLM should produce a cover letter. Cut and paste the LLM draft output into a word processor.

The Second Draft: Your Input is Required

Using the word processor, you should

  1. ensure the letter references key points in your background and it is aligned to the job description,
  2. ensure that letter touches upon the mandatory job requirements,
  3. add in additional content that elaborates on your experience, skills and how both will bring value to the role, and finally
  4. review the cover letter for accuracy and tone.

Once you are comfortable with your edits, cycle it back through the LLM.

  1. Cut and paste the following prompt into the LLM chat window,
  2. At the end of the prompt, append your revised cover letter,
  3. Press Submit.
Critique the following draft cover letter. Identify areas in the cover letter that address the mandatory job requirements, and identify areas in the resume that relate to mandatory job requirements that are not reflected in the cover letter. Using your critique, revise the cover letter, ensuring that 1) mandatory job requirements that match my background and skills are represented and 2) keywords in the job description are incorporated. Use concise English language in the active voice and maintain a professional tone.

Draft cover letter:

Notice the prompt specifies the LLM to highlight alignment to mandatory job requirements. This alignment should be a part of the critique the LLM generates.

Iterate Until You're Comfortable

I use multiple critique cycles until I am comfortable with the final draft.

Then I upload the cover letter into the ATS system.

What LLMs are the Best for this Task?

I have found the following LLM models provide content that meets my expectations, and provides consistent formatting output:

  1. OpenAI GPT-4
  2. Google Gemini

If you are technically oriented, and prefer using open source LLMs, I recommend experimenting with Meta's Llama-3 model at the HuggingFace chat site.

Reflections

I hope you find this technique useful. Drop a comment on this blog, or on my LinkedIn or Mastodon feeds. I would love to hear from you!

Recent Posts