Agent Browser

LLM-friendly text format

Sending screenshots of web pages to your multi-modal LLM agent works, but it is very expensive. The LLM Browser extracts the text from the rendered web page and pairs it with other information relevant to LLM agents, such as which areas of the page are clickable and where the text is positioned. Multi-modal agents can request a low- or high-resolution rendering of the page to accompany this information.
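
To make the idea concrete, here is a minimal sketch of how extracted text, positions, and clickability could be flattened into lines an LLM can read. This is not the actual LLM Browser format: the `TextElement` type, its fields, and the `render_for_llm` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextElement:
    """One piece of rendered text (hypothetical structure, for illustration)."""
    text: str
    box: Tuple[int, int, int, int]  # x, y, width, height in pixels
    clickable: bool = False


def render_for_llm(elements: List[TextElement]) -> str:
    """Flatten elements into one line each: index, kind, position, text."""
    lines = []
    for i, el in enumerate(elements):
        x, y, w, h = el.box
        kind = "click" if el.clickable else "text"
        lines.append(f"[{i}] {kind} @({x},{y},{w}x{h}) {el.text}")
    return "\n".join(lines)


page = [
    TextElement("Welcome back", (10, 10, 200, 30)),
    TextElement("Sign in", (10, 50, 80, 24), clickable=True),
]
print(render_for_llm(page))
```

An agent could then refer to a clickable element by its index instead of by pixel coordinates, which keeps the prompt short compared with a screenshot.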

Side effect detection

Only a small fraction of the web is publicly accessible and indexed. The deep web contains a lot of information that is valuable to human users, but accessing it requires the user to be logged in. It makes sense to allow an LLM agent to read your emails, but right now, if you grant that access, the agent will also be able to send emails. The LLM Browser lets users grant it access to their personal accounts, but it blocks LLMs from performing any side-effecting actions with those credentials without approval. Using the LLM Browser, an agent could draft a response to your email and leave it to you to press the send button.

Currently, side effect detection is very conservative. However, some popular websites are special-cased, and data providers can annotate which actions on their sites have side effects.
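
The approval rule described above could be sketched roughly as follows. This is an assumption about how such a check might work, not the browser's actual logic: the `Action` type, the `annotations` mapping, and the rule "anything other than GET or HEAD has side effects unless annotated otherwise" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Conservative default (assumption): only these HTTP methods are treated as read-only.
SAFE_METHODS = {"GET", "HEAD"}


@dataclass(frozen=True)
class Action:
    """A request the agent wants to make (hypothetical structure)."""
    method: str
    url: str
    uses_credentials: bool


def has_side_effect(action: Action, annotations: Dict[Tuple[str, str], bool]) -> bool:
    """Use a provider or special-case annotation if present, else the conservative default."""
    key = (action.method.upper(), action.url)
    if key in annotations:
        return annotations[key]
    return action.method.upper() not in SAFE_METHODS


def needs_approval(action: Action, annotations: Dict[Tuple[str, str], bool]) -> bool:
    """Require user approval for side-effecting actions that use the user's credentials."""
    return action.uses_credentials and has_side_effect(action, annotations)


send = Action("POST", "https://mail.example/send", uses_credentials=True)
read = Action("GET", "https://mail.example/inbox", uses_credentials=True)
print(needs_approval(send, {}), needs_approval(read, {}))
```

In this sketch an annotation can relax the default (for example, a search endpoint that happens to use POST) or tighten it (a GET link that unsubscribes the user).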

Cache

The LLM Browser intelligently determines whether to show a URL from cache or from the live internet. Cached URLs have lower latency and cost, and don't upset website owners with a flood of non-human traffic. You can manually override the cache behavior, but the default settings should work for most use cases. We have over 2 billion web pages in cache.
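
As a rough illustration of a cache decision with a manual override, consider the sketch below. The real selection logic is not described here, so a simple age threshold stands in for it; the function name `use_cache`, the `max_age_s` default, and the `"live"` / `"cache"` override values are all assumptions.

```python
from typing import Optional


def use_cache(
    cached_age_s: Optional[float],
    max_age_s: float = 3600.0,
    override: Optional[str] = None,
) -> bool:
    """Decide whether to serve a URL from cache (illustrative heuristic only).

    cached_age_s: age of the cached copy in seconds, or None if nothing is cached.
    override: "live" forces a live fetch, "cache" forces the cached copy.
    """
    if cached_age_s is None:
        return False  # nothing cached, so a live fetch is the only option
    if override == "live":
        return False
    if override == "cache":
        return True
    # Default: serve from cache while the copy is fresh enough.
    return cached_age_s <= max_age_s


print(use_cache(120.0))                   # recent copy, default settings
print(use_cache(120.0, override="live"))  # caller insists on a live fetch
```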

Data providers

Register as a data provider and get paid for the data your site produces.

Ethics

Unlike competing products, our browser is not designed to be parasitic on the websites that it visits. It respects robots.txt files, reports its user-agent string accurately, and will not help you solve CAPTCHAs. Right now, much of the internet is supported by ads. LLMs break this business model by taking the content without seeing the ads. These websites will need to change their business models or find ways to block LLM browsers, or they will die. You can register your site as a data provider to get paid directly for web traffic served via our browser. But this isn't an extortion situation. We won't steal your data if you don't accept our offer of payment.
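
For readers unfamiliar with robots.txt, the check a well-behaved client performs before fetching a page can be sketched with Python's standard `urllib.robotparser` module. This is not our implementation; the `allowed` helper and the `ExampleAgentBrowser` user-agent string are placeholders for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
print(allowed(rules, "ExampleAgentBrowser", "https://example.com/private/report"))
print(allowed(rules, "ExampleAgentBrowser", "https://example.com/blog/post"))
```

Reporting the user-agent string accurately matters here, because site owners write their rules against that string.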

If you find that the browser is blocked from a website that you need it to have access to, please reach out directly. We will work with all parties to come up with a good solution.

Dataset

A dataset of web pages rendered into the LLM Browser format will be available soon on Hugging Face. This can be used to train models to better understand the format. Note, however, that the format was designed to work well with current models that have not been trained for browsing the web.