November 22, 2023

What is robots.txt?

If you’re new to SEO, some of the more technical aspects of it can seem a bit confusing and overwhelming, especially if you’re not from a technical or web development background to begin with. Robots.txt is one such aspect of Technical SEO that causes more than a few head-scratches, so we got one of our Senior SEO Account Managers to take you through it. Let’s start with the basics: what exactly is robots.txt?

Robots.txt is a text file on your website that instructs web robots and crawlers on which pages they should or should not access. When it comes to search engine agents, the robots.txt file allows us to stop Googlebot, Bingbot, and other crawlers from accessing certain areas of your site, and better manage the crawl budget.

The robots.txt file is part of a number of tools that website owners and developers can use to implement the Robots Exclusion Protocol, alongside X-robots-tags, robots meta tags, and rel attributes.

Read on to find out more about how and why we use robots.txt files. 

How does robots.txt work?

Robots.txt is a simple text file, without any HTML markup. It is hosted on the web server, located at the root of your domain, and it is publicly accessible. If a website has a robots.txt file, you will be able to find it by typing in the domain followed by /robots.txt.
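
For example, our own file lives at https://wildcatdigital.co.uk/robots.txt.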

Robots.txt is the first file that search crawlers read after reaching a domain. This file provides bots with information on how to crawl the website and what pages, resources, or folders they should not crawl. If the bots do not find a robots.txt or if the file does not contain any disallow directives, it is implied that they can crawl all the links found on the domain. 

The file contains lines of text. Each line specifies a rule for one or more crawlers, allowing or disallowing their access to specific file paths on the domain. 

Wildcat’s robots.txt file indicates that all crawlers can access all URLs on the site. It also points to the specific location of the XML sitemap. 
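
It looks something like this:

User-agent: *

Disallow:

Sitemap: https://wildcatdigital.co.uk/sitemap_index.xml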

What is robots.txt used for?

The main goal of the robots.txt file is to manage good* bot traffic and activity, so the crawl budget is used effectively and servers do not become overloaded. The most common uses of robots.txt include allowing and disallowing specific agents, directories, or files, and specifying the location of your sitemap.

*more detail on this point under Limitations

How to create a robots.txt file

Many website builders will create a robots.txt file by default. Here is how you can create your robots.txt file if your website does not already have one, and how you can optimise your existing file. 

Syntax

The robots.txt file is structured as a series of lines, where each line contains a single field specifying a user-agent, an allow or disallow directive, or a sitemap location. The order of these fields matters to how crawlers interpret the file. Below, we will outline the most important rules to follow when writing your robots.txt file.

User-agent

Defines the web crawler or user agent that the rule applies to. It can be a specific agent or a wildcard (*) for all agents.
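
For example, a group of rules can be addressed to a single crawler by name, such as Bing’s (the path here is just an illustration):

User-agent: Bingbot

Disallow: /wp-admin/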

Disallow

The disallow command is the most commonly used directive in robots.txt. It tells crawlers to omit certain areas of the site. It can be used to:

Block the entire site for all crawlers:

User-agent: *

Disallow: /

Block a single file:

User-agent: *

Disallow: /wp-login.php

Block a whole directory:

User-agent: *

Disallow: /wp-admin/

Block a specific page, such as private account content:

User-agent: *

Disallow: /my-account/secret-info

Block URLs matching a pattern, such as dynamic shop search results:

User-agent: *

Disallow: /shop/?query=*

Allow

The allow command does just that – it allows bots to access certain pages or directories. Because bots will always follow the most specific command in the file, the allow directive can be used, for example, to allow access to a specific page within a disallowed directory, or to allow one crawler access while disallowing all others.

User-agent: *

Disallow: /wp-admin/

Allow: /wp-admin/admin-ajax.php

Sitemaps

Adding a link to the XML sitemap in your robots.txt file helps crawlers find all your pages and understand what you deem to be the most important links on your site.

User-agent: *

Disallow:

Sitemap: https://wildcatdigital.co.uk/sitemap_index.xml

Crawl delay*

The crawl delay directive can be used to tell a user agent to wait for a specified number of seconds between crawl requests. This helps avoid overtaxing the server.
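
For example, to ask Bingbot to wait ten seconds between requests:

User-agent: Bingbot

Crawl-delay: 10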

*While Bing and Yandex still recognise this directive, Google no longer does. However, the crawl frequency for Googlebot can be set through Google Search Console.

Field order and grouping in robots.txt

Understanding the logic robots use to read your file can help you write effective rules.

Directives are read in groups: all of the allow and disallow lines below a user-agent field apply to that user agent until the next user-agent line starts a new group. A wildcard (*) user agent applies the whole group to all crawlers.
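
For example, a file can allow all crawlers by default while blocking specific search engines entirely. The snippet below is an illustrative sketch (Slurp is Yahoo’s crawler, and Yandex’s bots respond to the Yandex token):

User-agent: *

Disallow:

User-agent: Slurp

User-agent: Yandex

Disallow: /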

In this configuration, all other bots will follow the first group of rules, where no disallows are in place. Both Yahoo and Yandex will follow the second group and will not crawl any of the pages in the domain. 
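
Crawlers also obey only the group that matches them most specifically. As an illustrative sketch, consider a file that disallows everything for all bots but then addresses Googlebot separately:

User-agent: *

Disallow: /

User-agent: Googlebot

Disallow: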

Because both groups could apply to Googlebot (the first group’s disallow addresses all bots), Googlebot will follow the second, more specific group and can crawl the site as normal.
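
Specificity also decides the outcome when an allow and a disallow rule in the same group both match a URL: the longest, most specific rule applies. Google’s documentation illustrates this with an example along the following lines, applied to the URL https://example.com/page:

Allow: /p

Disallow: /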

In this case, Google’s bots will follow the allow directive, because /p is the longer, more specific match for the page URL.

Wildcards give you even finer control. An asterisk (*) matches any sequence of characters, and a dollar sign ($) anchors a rule to the end of a URL. For example:

Disallow: /shop/?query=* blocks crawling of all dynamic shop search URLs.

Disallow: /*.php$ blocks all URLs ending in .php.

Disallow: /$ blocks the root URL without disallowing lower-level URLs like /root/file.

Robots.txt Limitations

For all its useful implementations, the robots.txt file has certain limitations that are important to know about before making any changes. 

Robots.txt does not enforce directives

It is important to note that the commands contained in the robots.txt file are directives, not rules. This means that malicious bots and crawlers can choose to ignore these directives. While you can rely on Google, Bing and most good bots to follow these directives, you must employ alternative methods to truly protect sensitive content on your website, like password-protecting files. 

Disallowed pages can be indexed

The disallow directives on the robots.txt file stop search engine crawlers from reading the content of the disallowed pages. However, when these pages are linked to from other crawlable pages, they may still be indexed and appear in search results.

Noindex directives in the robots.txt file are not supported by Google, and robots.txt directives should not be relied on to keep pages out of search results.

To reliably prevent certain pages from appearing in search results, we can use a noindex robots meta tag (or an X-Robots-Tag header) on the necessary pages.
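
For example, adding the following tag to a page’s <head> tells search engines not to index it. Note that the page must remain crawlable (not disallowed in robots.txt) for crawlers to see the tag:

<meta name="robots" content="noindex">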

Need Help With Your robots.txt File?

Our team of technical SEO specialists at Wildcat Digital have a wealth of experience setting up websites for success. Checking that your robots.txt is set up correctly and following best practices is a key step in our technical audits and campaign planning. If you need help with your robots.txt or have any concerns about the indexing and crawling of your website, get in touch today. 

Post by

Miruna Hadu
