A Supplementary Tool for Web-archiving Using Blockchain Technology

Authors

J.E. De Villiers and A.P. Calitz

DOI:

https://doi.org/10.23962/10539/29194

Keywords:

web-archiving, blockchain, trusted timestamping, non-repudiation, cryptographic hash functions, Merkle trees, DApps, smart contracts

Abstract

The usefulness of a uniform resource locator (URL) on the World Wide Web depends on the resource being hosted at the same URL in perpetuity. When a URL is altered or removed, the resource it pointed to, such as an image or document, becomes inaccessible. While web-archiving projects seek to prevent such loss of online resources, providing complete backups of the web remains a formidable challenge. This article outlines the initial development and testing of a decentralised application (DApp), provisionally named Repudiation Chain, as a potential tool to help address the challenges presented by shifting URLs and uncertain web-archiving. Repudiation Chain seeks to use a blockchain smart contract mechanism to allow individual users to contribute to web-archiving. It aims to offer unalterable assurance that a specific file and its URL existed at a given point in time, by generating a compact, non-reversible representation of the file at the time of its non-repudiation. If widely adopted, such a tool could contribute to the decentralisation and democratisation of web-archiving.
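
As a rough illustration of the "compact, non-reversible representation" mentioned in the abstract, the sketch below computes a cryptographic digest that binds a file's bytes to its URL. It is a minimal sketch under stated assumptions, not the article's implementation: the choice of SHA-256, the length-prefixed encoding, and the helper names `fingerprint` and `verify` are all hypothetical, and recording the digest in a smart contract is out of scope here.

```python
import hashlib


def fingerprint(file_bytes: bytes, url: str) -> str:
    """Return a hex digest binding a file's content to its URL.

    Hypothetical helper: SHA-256 and this encoding are assumptions,
    not details taken from the article.
    """
    url_bytes = url.encode("utf-8")
    digest = hashlib.sha256()
    # Length-prefix the URL so the (url, content) pair is encoded unambiguously.
    digest.update(len(url_bytes).to_bytes(8, "big"))
    digest.update(url_bytes)
    digest.update(file_bytes)
    return digest.hexdigest()


def verify(file_bytes: bytes, url: str, recorded_digest: str) -> bool:
    """Check a file and URL against a previously recorded digest."""
    return fingerprint(file_bytes, url) == recorded_digest


if __name__ == "__main__":
    # Example values are placeholders for illustration only.
    content = b"example file content"
    url = "https://example.org/report.pdf"
    recorded = fingerprint(content, url)
    print(recorded)
    print(verify(content, url, recorded))         # unchanged file and URL
    print(verify(content + b"!", url, recorded))  # altered file
```

Because the digest is fixed-length and one-way, only the digest (not the file itself) would need to be stored alongside a timestamp; anyone holding the original file and URL can later recompute it and compare.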

Published

30-06-2020

Section

Research Articles

How to Cite

De Villiers, J.E. and Calitz, A.P. (2020) “A Supplementary Tool for Web-archiving Using Blockchain Technology”, The African Journal of Information and Communication (AJIC) [Preprint], (25). doi:10.23962/10539/29194.