Saturday, September 29, 2012

The Disappearing Web: Decay is Eating our History



Researchers say that within a year of certain events, an average of 11 percent of the online material that was linked to had disappeared completely.



The Disappearing Web: Decay Is Eating Our History
By Mathew Ingram on September 20, 2012
One of the characteristics of the modern media age—at least for anyone who uses the Web and social media a lot—is that we are surrounded by vast clouds of rapidly changing information, whether it’s blog posts, news stories, or Twitter and Facebook updates. That’s great if you like real-time content, but there is a not-so-hidden flaw—namely, that you can’t step into the same stream twice, as Heraclitus put it. In other words, much of that information may (and probably will) disappear as new information replaces it, and small pieces of history wind up getting lost.
According to a recent study, which looked at links shared through Twitter about news events such as the Arab Spring revolutions in the Middle East, this could be turning into a substantial problem. The study, which MIT’s Technology Review highlighted in a recent post by the Physics arXiv blog, was done by a pair of researchers in Virginia, Hany SalahEldeen and Michael Nelson. They took a number of recent major news events over the past three years—including the Egyptian revolution, Michael Jackson’s death, the elections and related protests in Iran, and the outbreak of the H1N1 virus—and tracked the links that were shared on Twitter about each. Following the links to their ultimate source showed that an alarming number of them had simply vanished.
In fact, the researchers said that within a year of these events, an average of 11 percent of the material that was linked to had disappeared completely (and another 20 percent had been archived), and after two-and-a-half years, close to 30 percent had been lost altogether and 41 percent had been archived. Based on this rate of information decay, the authors predicted that more than 10 percent of the information about a major news event will likely be gone within a year, and the remainder will continue to vanish at the rate of 0.02 percent per day.
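To make those rates a bit more concrete, here is a rough back-of-the-envelope sketch in Python (not from the study itself) of how the quoted figures compound: roughly 11 percent of linked material gone within the first year, with the remainder decaying at about 0.02 percent per day after that. The even first-year accrual and the simple day-by-day compounding are assumptions made purely for illustration, so the output will not exactly reproduce the study's empirical 2.5-year figure.

    # Back-of-the-envelope illustration only; the rates come from the figures
    # quoted above, and the decay model is an assumption made for the example.
    FIRST_YEAR_LOSS = 0.11   # ~11% of linked material gone within one year
    DAILY_DECAY = 0.0002     # ~0.02% of the remainder lost per day thereafter

    def surviving_fraction(days):
        """Estimate the fraction of linked material still reachable after `days`."""
        if days <= 365:
            # assume the first-year loss accrues evenly over the year
            return 1.0 - FIRST_YEAR_LOSS * (days / 365.0)
        return (1.0 - FIRST_YEAR_LOSS) * (1.0 - DAILY_DECAY) ** (days - 365)

    for label, days in [("1 year", 365), ("2.5 years", 912), ("5 years", 1825)]:
        lost = (1.0 - surviving_fraction(days)) * 100
        print(f"after {label}: roughly {lost:.0f}% of linked material gone")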
It’s not clear from the research why the missing information disappeared, but it’s likely that in many cases blogs have simply shut down or moved, or news stories have been archived by providers who charge for access (something that many newspapers and other media outlets do to generate revenue). But as the Technology Review post points out (http://www.technologyreview.com/view/429274/history-as-recorded-on-twitter-is-vanishing-from/?ref=rss), this kind of information can be extremely valuable in tracking how historical events developed, such as the Arab Spring revolutions—which the researchers note were the original impetus for their study, since they were trying to collect as much data as possible for the one-year anniversary of the uprisings.
Other scientists, and particularly librarians, have also raised red flags in the past about the rate at which digital data are disappearing. The National Library of Scotland, for example, recently warned that key elements of Scottish digital life were vanishing into a “black hole” and asked the government to fast-track legislation that would allow libraries to store copies of websites. Web pioneer Brewster Kahle is probably the best-known digital archivist, as a result of his Internet Archive project and its Open Library.
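For anyone curious whether a given link has at least been captured somewhere, the Internet Archive exposes a public "availability" endpoint (https://archive.org/wayback/available) that reports the closest Wayback Machine snapshot for a URL. The minimal sketch below, using only Python's standard library, assumes that endpoint and its usual JSON response shape; treat it as an illustration rather than a definitive client.

    import json
    import urllib.parse
    import urllib.request

    def wayback_snapshot(url):
        """Return the URL of the closest archived copy of `url`, or None."""
        api = ("https://archive.org/wayback/available?url="
               + urllib.parse.quote(url, safe=""))
        with urllib.request.urlopen(api, timeout=10) as resp:
            data = json.load(resp)
        closest = data.get("archived_snapshots", {}).get("closest")
        return closest["url"] if closest and closest.get("available") else None

    # e.g. check whether a link shared during a news event was ever captured
    print(wayback_snapshot("http://www.example.com/"))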
Although the Virginia researchers didn’t deal with it as part of their study, a related problem is that much of the content that gets distributed through Twitter—not just websites that are linked to in Twitter posts, but the content of the posts themselves—is difficult and/or expensive to get to. Twitter’s search is notoriously unreliable for anything older than about a week, and access to the complete archive of your tweets is provided only to those who can make a special case for needing it, such as Andy Carvin of National Public Radio (who is writing a book about the way he chronicled the Arab Spring revolutions).
As my colleague Eliza Kern noted in a recent post, an external service called Gnip now has access to the full archive of Twitter content [http://gigaom.com/2012/09/19/for-a-price-gnip-brings-you-access-to-all-public-tweets-ever-sent/], which it will provide to companies for a fee. And Twitter-based search-and-discovery engine Topsy also has an archive of most of the full “firehose” of tweets—although it focuses primarily on content that is retweeted a lot—and provides that to companies for analytical purposes. But neither can be easily linked to for research or historical archiving purposes. The Library of Congress also has an archive of Twitter’s content, but it isn’t easily accessible, and it’s not clear whether new content is being added.
Twitter has talked about providing a service that would let users download their tweets at some point, but it hasn’t said when such a thing would be available—and even if users did create their own archives in this way (or by using tools like Thinkup from former Lifehacker editor Gina Trapani), it would be difficult to link those archives together in a way that would provide the kind of connected historical information the Virginia study is describing. And it’s not just Twitter: There is no easy way to get access to an archive of Facebook (FB) posts either, although users in Europe can request access to their own archive as a result of a legal ruling there.
For better or worse, much of the content flowing around us seems to be just as insubstantial as the clouds it’s hosted in, and the existing tools we have for trying to capture and make sense of it simply aren’t up to the task. The long-term social effects of this digital amnesia remain to be seen.
Source: http://www.businessweek.com/

Thursday, September 27, 2012

Reading Diary: Open Access by Peter Suber

Some really interesting comments on Open Access
"...For librarians reading this book****, it is definitely a plus that Suber doesn’t take the condescending route and proclaim libraries and librarians to be casualties of increased OA. On the other hand, libraries as institutions that passively pay exorbitant subscription bills tend to figure more in the text than librarians as active participants, leaders and allies in reforming scholarly communications. Although I’m sure it’s not intended to read this way (and there are a couple of good plugs for libraries & librarians in the last chapter), it’s not hard to imagine faculty members reading this book imagining that their libraries need rescuing rather than coming away with the idea that their libraries are full of librarians who would be happy joining them storming the barricades. Change will happen faster and better if we hang together.............................

Finally, who would I recommend this book to? First of all, this book is a must-have for any academic library. No question about that..."

****We have this book, Open Access (MIT Press Essential Knowledge series), on order at the RUL.

Thanks to Brenda for this interesting Twitter alert!



Article from the Chronicle of Higher Education
Excerpt: “...When I referee an article for a journal, it usually takes three to four hours of my time. Recently, two Taylor & Francis journals asked me to review article submissions for them. In each case, I was probably one of 20 to 30 people in the world with the expert knowledge to judge whether the articles cited the relevant literature, represented it accurately, addressed important issues in the field, and made an original contribution to knowledge.
If you wanted to know whether that spot on your lung in the X-ray required an operation, whether the deed to the house you were purchasing had been recorded properly, or whether the chimney on your house was in danger of collapsing, you would be willing to pay a hefty fee to specialists who had spent many years acquiring the relevant expertise. Taylor & Francis, however, thinks I should be paid nothing for my expert judgment and for four hours of my time.
So why not try this: If academic work is to be commodified and turned into a source of profit for shareholders and for the 1 percent of the publishing world, then we should give up our archaic notions of unpaid craft labor and insist on professional compensation for our expertise, just as doctors, lawyers, and accountants do.
This does not mean we would never referee articles free. Just as the lawyer who is my neighbor bills corporate clients a hefty fee but represents prisoners in Guantánamo pro bono, so academics could referee without charge for nonprofit presses but insist on professional rates of compensation from for-profit publishers that expect us to donate our labor while paying mansion salaries [over US$1 million per annum] to their chief executives and top managers...”
Want to Change Academic Publishing? Just Say No

Wednesday, September 26, 2012

Guidelines needed to prevent impact-factor abuse

Citations play a big part in assessing a journal's quality, but what happens when many of those citations come from papers authored by that journal's editorial board? Paul Peters considers the need to establish guidelines for appropriate citation practices.