Before you start using Lookyloo, you should be familiar with a few concepts; otherwise, it won’t make much sense.
We strongly recommend setting up and using your own Lookyloo instance, especially if you use it a lot or submit URLs you’d rather not make publicly available to anyone using the demo instance.
If you’re fine with that and just want to give it a try, you can start a Lookyloo capture by navigating to our demo instance. On the loaded page, click Start a new capture.
This is the entry point when you want to investigate a website.
The only required field is the URL, which is the URL you want to capture. It might be the index page of the domain to capture:
Or a more complete URL:
The other fields can be left blank and the default parameters for the instance will be used.
Lookyloo will always use an emulated browser (splash), but you can pass it parameters, such as the user agent.
The user agents listed on the page can come from two sources:
In either case, the most frequently used user-agent is selected by default. It is the one shown in the last dropdown menu when you load the page.
If you want to use the user-agent of a specific browser on a specific operating system, you can pick one in the lists.
Or you can pass the user-agent of your own browser by ticking the box.
Note: If your browser has a very special user-agent, it may leak information about you.
Or you can use the free-text field to pass the string you want as a user agent.
Note: The capture may fail because the website refuses the User-Agent.
This section allows you to pass more advanced parameters to the capture. Please only use it if you know what you’re doing, especially on the demo interface, as some of these parameters may contain private information.
The referer is optional but it must be a full URL if you want to use it.
Use case: some websites block requests if they’re not initiated by an internal URL. If that’s the case, pass the required URL in the referer field.
Note: It may not be enough, and the website may also require a cookie. If that is the case, see below.
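To illustrate what "a full URL" means here, the following minimal sketch checks a referer value before you paste it into the field. The function name `is_full_url` is hypothetical and not part of Lookyloo; it assumes that a full URL has an http or https scheme and a hostname.

```python
from urllib.parse import urlsplit


def is_full_url(value: str) -> bool:
    """Return True if the value has an http(s) scheme and a hostname."""
    parts = urlsplit(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.hostname)


print(is_full_url("https://www.example.com/login"))  # True
print(is_full_url("www.example.com/login"))          # False: no scheme
```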
This is a free-text field where you can pass a list of HTTP headers to the capture.
They must be in the following format: <HEADER_NAME>: <VALUE>, one per line. Anything else will be ignored.
Use case: some websites will redirect you to a specific URL if the Accept-Language header is set by your browser.
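The header format described above can be sketched as a small parser. This is only an illustration of the "one header per line, ignore everything else" rule, not Lookyloo's actual code; the function name `parse_headers` is hypothetical, and treating lines with an empty name or value as invalid is an assumption.

```python
from typing import Dict


def parse_headers(text: str) -> Dict[str, str]:
    """Parse '<HEADER_NAME>: <VALUE>' lines and skip anything else."""
    headers = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Assumption: a line without a colon, a name, or a value is ignored.
        if not sep or not name or not value:
            continue
        headers[name] = value
    return headers


raw = "Accept-Language: fr-FR\nnot a header\nDNT: 1\n"
print(parse_headers(raw))  # {'Accept-Language': 'fr-FR', 'DNT': '1'}
```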
The proxy is optional but it must follow the format provided in the field if you want to use it.
Use case: some websites block requests if they’re not initiated from a specific country or IP block. If that’s the case, you can pass a proxy to the capture.
Note: You need to set up your own proxy server, or you can use tor as a socks5 proxy.
If you use tor with the default configuration, the socks5 proxy only listens on 127.0.0.1, which is not accessible from splash because it is running in a docker container.
The easiest way to make it work is to have tor listen on every interface: set SocksPort 0.0.0.0:9050 in /etc/tor/torrc, restart tor, and then pass socks5://<IP_of_your_host>:9050 in the proxy field of Lookyloo.
If it doesn’t work, try opening http://<IP_of_your_host>:9050 in your browser. You should get a notification from tor telling you it is a socks5 proxy and not an http proxy.
If you don’t, something is incorrect with your tor configuration.
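As an alternative to the browser check, the sketch below sends a SOCKS5 greeting to the port and verifies that the answer starts with protocol version 5. It is a rough connectivity test under the assumption that the proxy accepts the "no authentication" method; the function name `speaks_socks5` is hypothetical and not part of Lookyloo or tor.

```python
import socket


def speaks_socks5(host: str, port: int = 9050, timeout: float = 5.0) -> bool:
    """Send a SOCKS5 greeting and check that the server replies as SOCKS5."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Version 5, one method offered, method 0x00 (no authentication).
            sock.sendall(b"\x05\x01\x00")
            reply = sock.recv(2)
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False
    # A SOCKS5 server answers with its version (5) and the selected method.
    return len(reply) == 2 and reply[0] == 5
```

Call it with the same address you would put in the proxy field, for example `speaks_socks5("<IP_of_your_host>", 9050)`. A result of `False` suggests that tor is not listening on that interface or that something else is answering on the port.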
It is empty by default. Please refer to the related documentation to start a new capture with a predefined cookie.
It is not enabled by default because it takes too long on most websites. The default depth is 1 (capture only the given URL).
If you enable the feature, a value of 2 will capture the given URL, extract all the links in the page, pick all the ones with the same hostname as the initial URL, and capture them. As you can imagine, it may get huge very fast.
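The link selection described for a depth of 2 can be sketched as follows. This is an illustration using only the Python standard library, not Lookyloo's implementation; the names `LinkCollector` and `same_host_links` are hypothetical, and only `<a href>` links are considered, which is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_host_links(page_url, html):
    """Return absolute links that share the hostname of page_url."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(page_url).hostname
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlsplit(absolute).hostname == host and absolute not in links:
            links.append(absolute)
    return links


page = (
    '<a href="/about">About</a>'
    '<a href="https://other.example.org/">Other</a>'
    '<a href="https://www.example.com/contact">Contact</a>'
)
print(same_host_links("https://www.example.com/", page))
# ['https://www.example.com/about', 'https://www.example.com/contact']
```

Each returned link would be captured in turn, which is why the number of captures grows quickly on pages with many internal links.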
As a general rule, if something is unclear, move your mouse over the part of the section that doesn’t make sense, and it should display a text with some explanations. If it doesn’t, please get in touch with us, or keep reading this page.
Simple example with the initial URL
The node with a thumbnail of the screenshot is the page as it would be displayed in your browser (after all the redirects). Click on the image to see the screenshot fullscreen.
This node contains one single URL, the content of the response is an HTML page
The response from the server has 3 cookies
In order to get to the landing page, we went from an unencrypted URL (http) to an encrypted one (https)
The initial response was empty, which generally means that the redirect was made by the server directly (3XX HTTP code)
The landing page loads resources from two different hostnames (
113 resources are loaded from
24 resources are loaded from
Some of the URLs in the node github.githubassets.com are themselves loading content from URLs on github.githubassets.com (8 fonts). This most probably comes from the CSS in the parent node.
In order to investigate it further, we can click on each of the hostnames and open an investigation popup, more on that below.
Clicking on the first node github.githubassets.com opens the following pop-up:
You will see every URL aggregated in that node
You can do a lot of things from there:
Get every resource loaded from the server
See if they are present in other captures (correlation by hash)
See the HTTP status code of the response
Download all the URLs and hashes
Get the cookies received or sent for each HTTP request
Copy individual URLs
If you put your mouse over the image icon, it will display the image
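The idea behind correlation by hash can be sketched as follows: hash the body of each resource, then look for hashes that appear in more than one capture. This is a simplified illustration, not Lookyloo's code. The use of SHA-512, the function names, and the layout of the `captures` dictionary are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def body_hash(body: bytes) -> str:
    """Hash a response body (SHA-512 is an assumption for this sketch)."""
    return hashlib.sha512(body).hexdigest()


def correlate(captures):
    """Map each hash seen in more than one capture to the capture names.

    captures is a hypothetical layout: {capture_name: {url: body_bytes}}.
    """
    seen = defaultdict(set)
    for capture_name, resources in captures.items():
        for body in resources.values():
            seen[body_hash(body)].add(capture_name)
    return {h: sorted(names) for h, names in seen.items() if len(names) > 1}


captures = {
    "capture_a": {"https://a.example/app.js": b"console.log(1);"},
    "capture_b": {"https://b.example/lib.js": b"console.log(1);",
                  "https://b.example/style.css": b"body {}"},
}
shared = correlate(captures)
print(list(shared.values()))  # [['capture_a', 'capture_b']]
```

Two resources served from different URLs are correlated here because their content is identical, which is what makes the hash more useful than the URL for spotting shared resources.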