Giving life to dead: role of WayBack Machine in recovery of dead URLs
Data Technologies and Applications
ISSN: 2514-9288
Article publication date: 6 July 2023
Issue publication date: 15 April 2024
Abstract
Purpose
The purpose of the present study is to identify the active and dead uniform resource locators (URLs) associated with web references and to compare the effectiveness of Chrome, Google and the WayBack Machine in retrieving the dead URLs.
Design/methodology/approach
The web references in Library Hi Tech from 2004 to 2008 were selected for analysis. The URLs were extracted from the articles to verify their accessibility in terms of persistence and decay. The URLs were then entered directly into an internet browser (Chrome), a search engine (Google) and the Internet Archive (WayBack Machine). The collected data were recorded in an Excel file and presented in tables and diagrams for further analysis.
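The checking procedure described above can be approximated programmatically. The following is a minimal sketch, not the authors' method: the study checked URLs by hand in Chrome, Google and the WayBack Machine, whereas this sketch swaps in an automated HTTP request. The function names and the rule that 2xx and 3xx responses count as "active" are assumptions made for illustration; the abstract does not state the exact criteria used. The lookup address is the Internet Archive's public availability endpoint.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def classify_status(status_code):
    """Label an HTTP status code as 'active' or 'dead'.

    Assumption: 2xx and 3xx responses count as active; all others as dead.
    """
    return "active" if 200 <= status_code < 400 else "dead"


def check_url(url, timeout=10.0):
    """Request a URL once and classify the outcome (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404 or 500.
        return classify_status(err.code)
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout or malformed URL.
        return "dead"


def wayback_availability_query(url):
    """Build the Internet Archive availability lookup address for a URL."""
    return "https://archive.org/wayback/available?" + urlencode({"url": url})


if __name__ == "__main__":
    # example.com is a placeholder, not a URL from the study.
    print(wayback_availability_query("http://example.com/page"))
```

A dead URL returned by `check_url` could then be passed to `wayback_availability_query`, and the resulting address requested to see whether an archived snapshot exists.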
Findings
Of the 1,083 web references, the largest number was retrieved by the WayBack Machine (786; 72.6 per cent), followed by Google (501; 46.3 per cent) and Chrome (402; 37.1 per cent). The study concludes that the WayBack Machine is the most efficient of the three: it retrieves the largest number of missing web citations and fulfills the mission of preserving web sources to a larger extent.
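The reported percentages follow directly from the counts. As a quick check, the short sketch below recomputes each retrieval rate from the figures given in the findings; the variable and function names are illustrative only.

```python
TOTAL_REFERENCES = 1083

# Counts of web references retrieved by each tool, as reported in the findings.
retrieved = {"WayBack Machine": 786, "Google": 501, "Chrome": 402}


def retrieval_rate(count, total):
    """Return the share of references retrieved, in per cent, to one decimal."""
    return round(100 * count / total, 1)


rates = {tool: retrieval_rate(n, TOTAL_REFERENCES) for tool, n in retrieved.items()}

for tool, rate in rates.items():
    print(f"{tool}: {rate} per cent")
```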
Originality/value
A good number of studies have analyzed the persistence and decay of web references; however, the present study is unique in comparing the dead URL retrieval effectiveness of an internet browser (Chrome), a search engine (Google) and the WayBack Machine of the Internet Archive.
Research limitations/implications
The web references of a single journal, Library Hi Tech, were analyzed for a period of five years only. A larger study across disciplines and sources may yield better results.
Practical implications
URL decay is becoming a major problem in the preservation and citation of web resources. The study offers recommendations for authors, editors, publishers, librarians and web designers to improve the persistence of web references.
Keywords
Acknowledgements
Future studies: Even though authors may appreciate the risk of future inaccessibility of internet references, they cannot easily avoid using the internet in their publications (Falagas et al., 2008). Therefore, researchers have to keep a vigilant eye on this issue. Future studies should ascertain whether accessibility depends on the type of source cited (journals, books), the kind of source (wikis, blogs), access to the source (commercial, open access), the quality of the source (peer-reviewed, predatory), where the source is archived (repositories, personal websites) and the country of the source (USA, India).
Citation
Loan, F.A., Khan, A.M., Andrabi, S.A.A., Sozia, S.R. and Parray, U.Y. (2024), "Giving life to dead: role of WayBack Machine in recovery of dead URLs", Data Technologies and Applications, Vol. 58 No. 2, pp. 201-213. https://doi.org/10.1108/DTA-06-2022-0242
Publisher
Emerald Publishing Limited
Copyright © 2023, Emerald Publishing Limited