The Azimuth Project
Experiments in how to do a literature search

How to get more citations using your pocket watch and download research papers for free

There have been several posts recently about how journals get academics to publish papers based on state-funded research, and then sell academics back their own research (which they presumably purchase using state funds).

[Image: hungry snake]

If only the above snake were being cooked in a frying pan by Elsevier, it might better depict the current state of affairs. Sounds delicious, doesn’t it? Today we’ll see how research doesn’t have to be fed back to us this way.

I’m happy to say that at least in a few areas of research, we’ve managed to largely avoid depending on access to these “bad-guy” journals. I’m also going to show you how to conduct a literature search using a free tool, and how to get cited more, using only a pocket watch. This last bit is no joke: statistics predicts that this is actually possible! There’s an obvious way to fix this current glitch in the system, so it hopefully won’t last much longer.

I’ve worked with a few people who found it very helpful to hear this explanation. Hopefully writing it down will help spread the word on how to use these excellent tools, and help make sure everyone’s research is available without a journal subscription. What prompted me to write this page was that Kyle Grist of Portland State recently got in touch. He’s getting started doing some quantum physics research in the area of adiabatic quantum computing. I thought, “what’s something I could tell him halfway across the world that would actually be useful to him?” I figured I’d suggest how to conduct a literature search. This might save him a bit of time; it would have saved me time if someone had told me how this works when I was in his shoes. It’s also easy to learn.

The arXiv

There is an amazing tool, which almost everyone reading this will already know about. It’s an online research portal known as the e-Print archive. When researchers finish papers and submit them to a journal, it’s nowadays standard practice in several disciplines to also submit the paper to the arXiv. Papers are classified by subject, and are available online for anyone with an internet connection to download free of charge.


One of the interesting things about this service is that the document source code is typically uploaded. If for some reason you don’t have enough bandwidth to download a PDF and would rather get the source and compile it yourself, or if you want to see the LaTeX of your favourite papers and learn how they typeset those magnificent formulas, you simply click on “Other formats” in any abstract listing. Another appealing feature is that the arXiv adds the familiar date-stamp, written together with a subject classification vertically on the left margin of the first page.

Storing papers on the arXiv also allows researchers to reference your work, regardless of whether it’s published in a journal. For instance, here is a paper that I picked today while writing this post. I can cite it as

It appeared just a few days ago on January 21st.

Sounds like an interesting paper, but I have not read it. One thing to notice is that now that we’re referencing this paper in this blog article, by setting a so-called trackback, the arXiv page will link to this blog. This is displayed in the bottom right-hand corner, like this:

blog link

There are a few ways to search the arXiv. For example, you can search “recent” or “new” uploads in “quantum physics”. On the first page of the arXiv, you can scroll down and click on recent

blog link

You’ll then see a list of papers recently posted to the arXiv. This is how I found the paper I cited above. I then went to the abstract listing of this paper by clicking arXiv:1301.4956

blog link
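As a small aside, the identifier arXiv:1301.4956 itself tells you when the paper was posted: in the current arXiv scheme the first four digits are the year and month (YYMM), followed by a sequence number. The sketch below parses such an identifier and builds the abstract-page address. The function name `parse_arxiv_id` is made up for this illustration, and it only handles the new-style identifiers, not the older subject-prefixed ones.

```python
import re

# New-style arXiv identifiers: YYMM.NNNN (or YYMM.NNNNN), optionally with a
# version suffix such as "v2" and an "arXiv:" prefix.
ARXIV_ID = re.compile(r"^(?:arXiv:)?(\d{2})(\d{2})\.(\d{4,5})(v\d+)?$")


def parse_arxiv_id(identifier):
    """Split a new-style arXiv identifier into year, month and number.

    Hypothetical helper for illustration; old-style identifiers are rejected.
    """
    match = ARXIV_ID.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a new-style arXiv identifier: {identifier!r}")
    yy, mm, number, version = match.groups()
    month = int(mm)
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in identifier: {identifier!r}")
    return {
        "year": 2000 + int(yy),  # new-style identifiers began in 2007
        "month": month,
        "number": number,
        "version": version,  # None when no version suffix is given
        "abs_url": f"https://arxiv.org/abs/{yy}{mm}.{number}",
    }


info = parse_arxiv_id("arXiv:1301.4956")
print(info["year"], info["month"], info["abs_url"])
```

For the paper above this gives the year 2013 and month 1, which matches the January posting date.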

The arXiv has what they call a submission window. You can submit an article at any time, but new postings are not announced immediately; instead they are accumulated over a given submission window and then announced together. The position of an article in this list depends on when the article was submitted. It seems they’re using a first in, last out data structure (FILO).
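To make the ordering concrete, here is a minimal sketch of the first in, last out behaviour described above. It only models the ordering as this post describes it, not the arXiv’s actual implementation. The paper names, the submission times and the function `announcement_order` are all invented for the example.

```python
from datetime import datetime

# Hypothetical submissions within one window: (identifier, submission time).
submissions = [
    ("paper-A", datetime(2013, 1, 21, 9, 0)),
    ("paper-B", datetime(2013, 1, 21, 15, 30)),
    ("paper-C", datetime(2013, 1, 21, 12, 45)),
]


def announcement_order(subs):
    """Return identifiers with the latest submission listed first."""
    latest_first = sorted(subs, key=lambda item: item[1], reverse=True)
    return [paper_id for paper_id, _ in latest_first]


print(announcement_order(submissions))
```

Under this model paper-B, submitted last, lands at the top of the list, and paper-A, submitted first, lands at the bottom.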

Something you might not know is that people have actually been able to correlate the position at which an article appears in this list with the article’s impact!

  • “We confirm and extend a surprising correlation between article position in these initial announcements… A pure ‘visibility’ effect was also present: the subset of articles accidentally in early positions fared measurably better in the long-term citation record than those lower down…”

See the following for more details,

We’ll talk a bit more about the arXiv’s new paper list below. When reading a paper for the purpose of doing research, people are often concerned with two things. The first is finding some of the papers referenced in a given work. The second is finding the list of papers that build on the ideas you’re reading about. There is a nice tool to determine both of these things.

ADS Abstract Service

The so-called SAO/NASA ADS Physics Abstract Service provides an article’s citation web. The citation web includes links to all of the articles a document cites, and also all of the articles citing a given document. You can get to this page from the abstract listing on the arXiv.

blog link

Clicking on NASA ADS takes you to the ADS listing, with some useful links in the upper left-hand corner:

blog link

This is a new paper, so it does not have any articles citing it yet. However, you can click on References in the Article to get a list of the papers (typically with links) that this article cites. Let’s click on the 5th article in the list, and take a look at its ADS listing.

blog link

Here we can again explore not only the references to other work contained in this paper (References in the Article), but also the references to this paper from other work (Citations to the Article (395)). This paper by Farhi et al. turns out to be one I read some time ago.

blog link

Now, if someone were starting research on this subject, they could go through and take a look at all 395 citing articles, gathering those that are most relevant to what they’re interested in. This is useful, and I find the ADS citations list sometimes contains more references than Google Scholar (which we’ll talk more about below), and sometimes simply a different list.
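The two directions of the citation web are really one data set viewed two ways: if you know which papers each article references, you can invert that map to find which papers cite it. Here is a minimal sketch of the idea. The paper names and the function `build_citations` are invented for the example, and this says nothing about how ADS stores its data.

```python
from collections import defaultdict

# Hypothetical data: each paper maps to the list of papers it references.
references = {
    "new-paper": ["paper-F", "paper-X"],
    "paper-Y": ["paper-F"],
    "paper-F": ["paper-Z"],
}


def build_citations(refs):
    """Invert a references map: for each paper, list the papers that cite it."""
    cited_by = defaultdict(list)
    for paper, cited in refs.items():
        for target in cited:
            cited_by[target].append(paper)
    return dict(cited_by)


citations = build_citations(references)
print("References in paper-F:", references["paper-F"])
print("Citations to paper-F:", citations["paper-F"])
print("Citations to new-paper:", citations.get("new-paper", []))
```

A brand new paper shows up with references but an empty citations list, just like the one we started from.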

Some issues with the current tools

Although these tools are fantastic, and the developers should feel proud to have changed so much related to how we do research and the availability of knowledge, there are a few small issues that I think, if fixed, would improve things greatly.

Instead of the ADS Abstract Service I mentioned above, probably most of you are instead familiar with Google Scholar. It’s a nice tool.

One of the things I personally would like to see in Google Scholar is the ability to display more details of an author’s citation habits. For example, new third-party (non-Google) tools are being developed to check things such as self-citation, such as

reported in the publication

Alas, I couldn’t find the publication by Couto et al. on the arXiv, but at least it seems to be freely downloadable from the journal webpage. I should also mention that this tool uses data from Google Scholar, and that access to this data seems slow for third-party tools. Hopefully this could be improved a bit.

The point of the service is that it allows one to probe the self-citation habits of authors. The service provided by Google is free, so we can’t complain too much. The paid services I know of do provide this feature, however. It can be an important statistic, because the algorithms Google Scholar uses to find papers can sometimes count citations from documents that are not research publications, such as documents some authors place on their webpages. This opens the door to abuse of the system, so in my view it should be changed.
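To give an idea of what such a statistic looks like, here is a rough sketch that computes the fraction of an author’s citations that come from papers the same author helped write. The data layout, the author names and the function `self_citation_rate` are assumptions made for this example; this is not how the third-party tool or Google Scholar works internally.

```python
def self_citation_rate(author, papers):
    """Fraction of citations to `author`'s papers that list `author` as an author.

    `papers` is a list of dicts with the keys "authors" (list of names) and
    "cited_by" (one list of author names per citing document).
    """
    total = 0
    self_cites = 0
    for paper in papers:
        if author not in paper["authors"]:
            continue  # only count citations to the author's own papers
        for citing_authors in paper["cited_by"]:
            total += 1
            if author in citing_authors:
                self_cites += 1
    return self_cites / total if total else 0.0


# Hypothetical records, invented for illustration.
papers = [
    {
        "authors": ["A. Author", "B. Coauthor"],
        "cited_by": [
            ["A. Author"],
            ["C. Other"],
            ["B. Coauthor", "A. Author"],
            ["D. Else"],
        ],
    },
    {"authors": ["C. Other"], "cited_by": [["A. Author"]]},
]

print(self_citation_rate("A. Author", papers))
```

Matching authors by name string is of course fragile (think of initials and common surnames), so a real tool would need something more careful.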

Clearly none of these tools are going to work very well for older papers that are not online. However, as time goes on many of the older papers are starting to make their way to the web, so that’s not a real issue.

Now, that first in, last out arXiv new article list I told you about has caused a few people I’ve encountered to wait until the very end of a given time window for arXiv submissions to post their papers. The last person to submit their paper will be the first in the list, etc. I think this should be quickly fixed by producing a random list of the new postings.
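The proposed fix is easy to sketch: shuffle the new postings before listing them, so that submission time no longer decides who ends up on top. The function `randomized_listing` and the paper names are made up for this illustration, and the optional seed is only there to make the example reproducible.

```python
import random


def randomized_listing(paper_ids, seed=None):
    """Return the new postings in a random order, leaving the input untouched."""
    rng = random.Random(seed)
    shuffled = list(paper_ids)
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical postings, in order of submission.
new_postings = ["paper-A", "paper-B", "paper-C", "paper-D"]
print(randomized_listing(new_postings))
```

With a listing like this, waiting with a pocket watch for the right moment to submit buys you nothing.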

Returning to the issue of using state funding to buy research articles about research done using state funds, you can see that the arXiv is one way to avoid all of this. For some reason, this all reminds me of the movie Fight Club, where Edward Norton and his imaginary alter ego Brad Pitt (as Tyler Durden) are robbing the dumpster at a plastic surgery outfit to get the fat needed for soap production. The fat had presumably been removed by liposuction, and Tyler Durden turns to Norton and says

  • “We’re selling rich women their own fat 4#&!!7 back to them”
