Camera lens reflecting a crowd at a music concert

Weights in Tengrai: Sharpen Your Creative Lens with Attention Syntax

Ever tried taking the perfect event photo only to find your big-lens camera’s autofocus has a mind of its own, enthusiastically focusing on a random passerby instead of your smiling friends? Much like a stubborn camera, generative AI can sometimes latch onto the less important details of a scene. But what if you could give your AI the insider scoop on what truly matters in your masterpiece? Enter the creative process of Tengrai, where you’re not just randomly generating images; you’re fine-tuning them with the precision of a seasoned photographer adjusting their lens’s manual focus ring.


Imagine being able to subtly nudge your AI’s ‘gaze,’ encouraging it to detail the delicate frost patterns on a winter windowpane while glossing over the bustling street scene just beyond. In prompting you can do just that, applying attention weights in your image prompts to highlight or downplay elements as you see fit. This isn’t just a tool; it’s your creative sidekick, poised to help you capture and express your vision with more clarity and better-aligned flair.

Why settle for a haphazard snapshot when you can compose a visual symphony? Let’s embark on a journey through the mechanics of this feature and discover how you can make your AI-generated images not just see but truly look.

Understanding Attention Weights in Tengrai

Think of crafting images with Tengrai as writing a vivid story. In this creative process, attention weights function much like punctuation in a sentence. They do not form the words themselves, but they significantly influence how those words are understood and emphasized. While styles and other tools might set the tone or genre of your visual narrative, attention weights adjust the focus, subtly guiding the AI to pay more attention to certain elements and less to others. This selective emphasis helps bring clarity and impact to the scenes you envision, much as carefully chosen punctuation sharpens and refines a written message.

What are Attention Weights?

At its core, an attention weight is a simple yet powerful command that adjusts how much the AI model focuses on specific elements of your prompt. You can imagine it as a spotlight operator at a play, intensifying the light on the lead actor while keeping the background dim to draw the audience’s eyes to where the action is hottest.

A man actor in the spotlight

Syntax and Usage

The syntax for applying attention weights is straightforward: wrap the target word or phrase with parentheses and append a colon, followed by the numerical weight. For example, (sunset:1.5) tells the AI to pay 1.5 times more attention to the concept of a sunset. Conversely, (crowd:0.5) would halve the attention on the crowd, focusing on it less prominently during image generation.
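To make the syntax concrete, here is a minimal Python sketch that picks weighted spans out of a prompt string. The regular expression and the function name find_weights are assumptions made for this illustration; they are not Tengrai’s actual parser, which may treat nesting, escaping, or spacing differently.

```python
import re

# Hypothetical pattern for "(phrase:1.5)" or "(phrase: 1.5)".
# Written for illustration only; not Tengrai's real parsing logic.
WEIGHTED = re.compile(r"\(([^():]+):\s*(\d+(?:\.\d+)?)\)")


def find_weights(prompt: str) -> list[tuple[str, float]]:
    """Return (phrase, weight) pairs for every weighted span in the prompt."""
    return [
        (match.group(1).strip(), float(match.group(2)))
        for match in WEIGHTED.finditer(prompt)
    ]


print(find_weights("golden (sunset:1.5) over the bay, (crowd:0.5)"))
# → [('sunset', 1.5), ('crowd', 0.5)]
```

Words outside parentheses, such as "golden" and "over the bay", are simply not reported here because they carry no explicit weight.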

Default Settings

When no weight is specified, the AI assumes a neutral focus, a weight value of 1. This standard setting is like having no bias in the lens—it treats all aspects of the prompt equally. But when you need to make your image tell a specific story, adjusting these weights lets you shift the narrative focus with ease.
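The default can be pictured with a small Python sketch that splits a prompt into pieces and assigns a weight of 1.0 to anything not wrapped in parentheses. The names split_with_defaults and DEFAULT_WEIGHT, and the regular expression itself, are hypothetical and only model the behavior described above; the way Tengrai applies weights internally is not covered in this article.

```python
import re

# Hypothetical pattern for "(phrase:weight)"; an assumption for this sketch.
WEIGHTED = re.compile(r"\(([^():]+):\s*(\d+(?:\.\d+)?)\)")
DEFAULT_WEIGHT = 1.0  # neutral focus when no weight is written


def split_with_defaults(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) pieces, defaulting to 1.0."""
    pieces = []
    position = 0
    for match in WEIGHTED.finditer(prompt):
        # Text before the weighted span keeps the neutral default.
        plain = prompt[position:match.start()].strip(" ,")
        if plain:
            pieces.append((plain, DEFAULT_WEIGHT))
        pieces.append((match.group(1).strip(), float(match.group(2))))
        position = match.end()
    # Whatever follows the last weighted span is also unweighted.
    tail = prompt[position:].strip(" ,")
    if tail:
        pieces.append((tail, DEFAULT_WEIGHT))
    return pieces


print(split_with_defaults("misty harbor, (lighthouse:1.4), seagulls"))
# → [('misty harbor', 1.0), ('lighthouse', 1.4), ('seagulls', 1.0)]
```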

Through this mechanism, Tengrai transforms from a mere tool to an extension of your creative intent, allowing you to emphasize a nostalgic texture in an old photograph or downplay modern elements in a historical setting. Whether you’re crafting a bustling cityscape or a serene nature scene, attention weights give you the control to better align the results with your artistic vision.

Abstract shapes, yellow, black and red colors

By understanding and using attention weights, you empower yourself to push the boundaries of traditional image generation, creating pieces that are not only visually appealing but also more deeply resonant with your personal creative goals.

Proceed with Care: Balancing Attention Weights

While adjusting attention weights can significantly enhance the focus on desired elements, it’s important to avoid setting them excessively high. Over-steering attention can overwhelm the AI, leading to a loss of coherence in the generated images. This could result in outputs that are jumbled and lack recognizable forms or shapes.
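One way to keep yourself honest is a quick check that lists any weights above a ceiling you choose before you submit a prompt. The sketch below is only an illustration: this article does not state a numeric limit, so max_weight is left to the caller, and the value 3.0 in the example is arbitrary. The function name flag_extreme_weights and the regular expression are hypothetical.

```python
import re

# Hypothetical pattern for "(phrase:weight)"; an assumption for this sketch.
WEIGHTED = re.compile(r"\(([^():]+):\s*(\d+(?:\.\d+)?)\)")


def flag_extreme_weights(prompt: str, max_weight: float) -> list[tuple[str, float]]:
    """Return the weighted phrases whose weight exceeds max_weight."""
    flagged = []
    for match in WEIGHTED.finditer(prompt):
        weight = float(match.group(2))
        if weight > max_weight:
            flagged.append((match.group(1).strip(), weight))
    return flagged


# 3.0 is an arbitrary ceiling chosen for the example, not a documented limit.
print(flag_extreme_weights("(neon signs:6), rainy street, (umbrella:1.3)", max_weight=3.0))
# → [('neon signs', 6.0)]
```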

Warning sign, yellow and black
It’s essential to find a balance to ensure the final image remains clear and true to your artistic vision.

Focusing Attention – Practical Examples

One of the most useful applications of attention weights in Tengrai is refining and emphasizing the artistic intent of an image prompt. To illustrate this, let’s explore how to adjust our prompt to more accurately capture the style of a legendary photographer.

The Art of Henri Cartier-Bresson and Street Photography

Henri Cartier-Bresson mastered candid photography and is often credited with pioneering photojournalism. His technique, known as “the decisive moment,” captures the poetry and geometry of spontaneous scenes within public places. Meticulous framing and timing, elements that bring a dramatic and almost theatrical quality to everyday street scenes, characterize Cartier-Bresson’s work.

Let’s craft a prompt that channels elements reminiscent of Henri Cartier-Bresson’s iconic style, such as candidness, dramatic lighting, and an urban setting. The prompt we’ll explore is: “candid street portrait, gritty New York City intersection, man in worn clothing, cracked sidewalk, urban decay, bustling pedestrians, blurred traffic, vehicles, pigeons flying in the background, dramatic chiaroscuro lighting, high contrast, Henri Cartier-Bresson photojournalistic style.”

With so many details, the essential style of Henri Cartier-Bresson risks being diluted, as the varied results from the initial, unweighted prompt demonstrate below. Some generated images may not prioritize the dramatic chiaroscuro lighting and high contrast that are crucial for this style.

Results for a street photography prompt, 1
Images lacking the dynamic essence of Cartier-Bresson’s style

Furthermore, important elements like pigeons flying might not be present, and the images could vary in their adherence to the desired photojournalistic style.

Results for a street photography prompt, 1
Some key elements requested in the prompt are missing, and the depicted scenes feel static

Revised Prompt with Attention Weights

To address these issues, we can revise the prompt by applying attention weights to guide the AI more precisely:

candid street portrait, gritty New York City intersection, (man in worn clothing: 1.5), cracked sidewalk, urban decay, (bustling pedestrians: 0.5), (blurred traffic: 0.1), (vehicles: 0.25), pigeons (flying: 2.5) in the background, dramatic chiaroscuro lighting, high contrast, (Henri Cartier-Bresson: 3) photojournalistic style

Explanation of revisions:

  • Man in Worn Clothing (1.5): Enhancing focus on the subject to capture the raw, individual character.
  • Bustling Pedestrians (0.5), Blurred Traffic (0.1), and Vehicles (0.25): These elements from the background are de-emphasized, allowing more critical components to stand out.
  • Flying (2.5): This weight ensures that the movement and freedom of the pigeons are captured, adding a dynamic element to the composition.
  • Henri Cartier-Bresson Style (3): This high weight steers the AI to prioritize Cartier-Bresson’s distinctive photojournalistic style, ensuring the images reflect his technique of dramatic lighting and high contrast.
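If you iterate on a prompt often, it can help to assemble it from parts so the weights sit in one place. The following Python sketch rebuilds the revised prompt above; the helper name weighted is hypothetical, and the code only produces the text syntax shown in this article. It does not call any Tengrai API.

```python
def weighted(text: str, weight: float) -> str:
    """Wrap text in the "(text: weight)" form used in the revised prompt."""
    # The "g" format drops trailing zeros, so 3 renders as "3" and 0.25 as "0.25".
    return f"({text}: {weight:g})"


parts = [
    "candid street portrait",
    "gritty New York City intersection",
    weighted("man in worn clothing", 1.5),   # emphasize the subject
    "cracked sidewalk",
    "urban decay",
    weighted("bustling pedestrians", 0.5),   # de-emphasize the background
    weighted("blurred traffic", 0.1),
    weighted("vehicles", 0.25),
    "pigeons " + weighted("flying", 2.5) + " in the background",
    "dramatic chiaroscuro lighting",
    "high contrast",
    weighted("Henri Cartier-Bresson", 3) + " photojournalistic style",
]

revised_prompt = ", ".join(parts)
print(revised_prompt)
```

Changing a single number in the list and regenerating makes it easier to see which weight is responsible for a change in the results.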

Impact of Revised Prompt

Adjusting the attention weights in the revised prompt enhances alignment with the desired artistic vision, inspired by the iconic style of Henri Cartier-Bresson.

Street photography with attention weights, 1

No AI can replicate Cartier-Bresson’s genius. Still, emphasizing the style cue and the flying pigeons while de-emphasizing less crucial background details noticeably improves how well the generated images match the intended creative vision.

This approach results in imagery that is not only visually compelling but also resonates more deeply with the nuances of the specified photographic style.

Fine-Tuning a Victorian Ballroom Scene with Attention Weights

Let’s create another example that shows the power of attention weights in refining the output of an image prompt in a complex and thematic scene. Consider this example: “Opulent Victorian ballroom, grand chandeliers, ornate gold mirrors, dozens of guests in formal attire, a live orchestra, intricate wallpaper, flowing champagne, a woman in a red dress.”

The original prompt may lead to images where an overload of details overshadows key subjects, like the woman in the red dress. The elaborate background could detract from highlighting crucial thematic elements, such as the woman’s implied elegance and distinct presence.

Results for a ballroom scene
Obtained images fail to feature the woman in the red dress

To ensure the key elements stand out and the thematic focus remains clear, the prompt can be revised with a strategic attention weight:

Opulent Victorian ballroom, grand chandeliers, ornate gold mirrors, dozens of guests in formal attire, a live orchestra, intricate wallpaper, flowing champagne, (a woman in a red dress: 2.0)

Doubling the attention on the woman in the red dress ensures she is the standout feature of the image, capturing the viewer’s eye and effectively conveying her importance in the scene.
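The same single-phrase revision can be expressed as a small Python helper that wraps one phrase of an existing prompt in a weight. The function name emphasize is hypothetical, and the sketch only edits the prompt text; it makes no claim about how Tengrai processes the result.

```python
def emphasize(prompt: str, phrase: str, weight: float) -> str:
    """Wrap the first occurrence of phrase in "(phrase: weight)" syntax."""
    if phrase not in prompt:
        raise ValueError(f"phrase not found in prompt: {phrase!r}")
    return prompt.replace(phrase, f"({phrase}: {weight})", 1)


original = (
    "Opulent Victorian ballroom, grand chandeliers, ornate gold mirrors, "
    "dozens of guests in formal attire, a live orchestra, intricate wallpaper, "
    "flowing champagne, a woman in a red dress"
)

revised = emphasize(original, "a woman in a red dress", 2.0)
print(revised)
```

Raising an error when the phrase is missing guards against a typo silently leaving the prompt unweighted.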

Impact of Revised Prompt

Adjusting attention weights highlights the woman in the red dress as the ballroom’s central figure. This strategic focus shift not only draws attention to her but also allows the opulent details of the Victorian setting to serve as a rich backdrop. The final image is then both visually appealing and a more accurate reflection of the envisioned focus of the scene.

Woman wearing an elegant red dress, ballroom setting, 1

This example demonstrates how attention weights manage the visual hierarchy of a scene, keeping key elements like the woman in the red dress prominent within the composition, even amid a multitude of details.

Conclusion: The Art of Focused Creativity

In the grand theater of life, attention is the diva, demanding the spotlight and directing our gaze to the day’s showstoppers—from the warm smile of a loved one to the alluring aroma of a fresh, steaming “kürtőskalács” cooling on the windowsill.

This concept transcends the realms of poetry and Transylvanian spit cake enthusiasts to also become the star of the artificial intelligence stage, as the cleverly titled paper “Attention Is All You Need” unveiled. Judging only by the title, the research might sound like the latest self-help bestseller, but instead of coaching life transformations, it revolutionized AI with transformers. This paper ignited a new era of generative AI, enabling systems such as myself to convey meaningful content, all by focusing on the right patterns and contexts.

Woman's intense gaze

In a similar vein, fine-tuning the AI’s gaze in Tengrai allows you to transform a seemingly chaotic array of elements into a compelling narrative, ensuring that details serve your artistic vision. The journey through adjusting prompts with attention weights showed us the power of attention in AI image generation. From the bustling streets captured with the essence of Henri Cartier-Bresson to the elegant drama of a Victorian ballroom, the ability to manipulate focus points is an indispensable tool in the generative digital artist’s palette.

Now, why not take this feature for a spin? Adjust the attention, play with weights, and watch as your images shift from ordinary to extraordinary.
