FUJIFILM & JEITA News Letter
Which Storage Did One of the World's
Greatest Research Institutes Choose?
CERN (the European Organization for Nuclear Research) is one of the world's leading research institutes. It has two main sites in the region straddling the border between Switzerland and France. Underground lies the Large Hadron Collider (LHC), a ring 27 km in circumference. The Higgs boson was discovered here. HTML, HTTP, and the World Wide Web, which are now indispensable, were also invented here. Without these inventions, the Internet era as we know it, today's Big Data, and the coming IoT would not exist. They are undoubtedly among the great inventions of the century.
LHC Generates 1 Petabyte per Second
CERN's LHC generates an enormous amount of data, as much as 1 petabyte per second. The data must be recorded at high speed and preserved safely for the long term, and doing so with little energy and at low cost matters even more. Data from past experiments cannot be reproduced, because repeating them would be enormously expensive. So what storage does CERN use?
What Is the Best, and Why?
The storage CERN chose for its valuable data is magnetic tape. Tape was first used to store computer data in 1951, more than 60 years ago, which makes it one of the oldest digital recording media in the world. Why choose an “old technology” when the research itself is at the cutting edge? CERN engineers give the four reasons below.
1) Low Cost, Low Power Consumption
Magnetic tape is inexpensive overall when you consider not only the purchase price but also the electricity bill. Unlike other media, tapes consume no electricity while they are not in use. That matters to anyone who needs to preserve a tremendous volume of data for a long time while keeping costs down.
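The cost argument above can be sketched as a rough calculation: purchase price plus the electricity needed to keep the media available over the retention period. Every figure below (price per terabyte, idle power draw, electricity price, capacity, retention period) is a hypothetical placeholder chosen for illustration, not a CERN or vendor number. The only point taken from the text is that tape on the shelf draws no power.

```python
HOURS_PER_YEAR = 24 * 365


def storage_cost(capacity_tb, media_cost_per_tb, idle_watts_per_tb,
                 years, price_per_kwh):
    """Purchase cost plus electricity cost of keeping data for `years`."""
    purchase = capacity_tb * media_cost_per_tb
    # Watts -> kilowatt-hours over the whole retention period.
    energy_kwh = capacity_tb * idle_watts_per_tb * HOURS_PER_YEAR * years / 1000
    return purchase + energy_kwh * price_per_kwh


# Hypothetical inputs, for illustration only.
CAPACITY_TB = 1000
YEARS = 10
PRICE_PER_KWH = 0.15

disk_cost = storage_cost(CAPACITY_TB, media_cost_per_tb=25.0,
                         idle_watts_per_tb=1.0, years=YEARS,
                         price_per_kwh=PRICE_PER_KWH)
# Tape on the shelf is modeled as drawing no power (assumption from the text).
tape_cost = storage_cost(CAPACITY_TB, media_cost_per_tb=10.0,
                         idle_watts_per_tb=0.0, years=YEARS,
                         price_per_kwh=PRICE_PER_KWH)

print(f"disk: {disk_cost:.0f}  tape: {tape_cost:.0f}")
```

With other inputs the gap will differ; the sketch leaves out drives, libraries, cooling, and staff, which a real comparison would need to include.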
2) Reliability
If a disk is damaged, terabytes of data can be lost at once. With tape, even if part of the stored data cannot be read, the loss is limited to a few gigabytes. Moreover, tape is said to retain data for more than 50 years. Disks, unfortunately, cannot be called suitable for long-term storage.
3) Security
Most data loss worldwide is said to result from operating mistakes, hacking, or tampering by an organization's own employees, which means it cannot be entirely prevented in advance. However, erasing the entire volume of data that CERN keeps on tape would take years, whereas data kept on disk could be erased in a matter of seconds.
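The "years to erase" point is at heart a throughput calculation: the total volume divided by the combined rate at which the available drives can overwrite it. A minimal sketch follows, assuming a hypothetical archive size, drive count, and per-drive rate. None of these are CERN figures, and mount, seek, and rewind time are ignored, so a real operation would take longer.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_overwrite(archive_bytes, drives, bytes_per_sec_per_drive):
    """Time to overwrite the whole archive with all drives running in parallel."""
    total_rate = drives * bytes_per_sec_per_drive
    return archive_bytes / total_rate / SECONDS_PER_YEAR


# Hypothetical inputs, for illustration only.
ARCHIVE_BYTES = 100 * 10**15      # 100 PB
DRIVES = 10
RATE_PER_DRIVE = 300 * 10**6      # 300 MB/s

years = years_to_overwrite(ARCHIVE_BYTES, DRIVES, RATE_PER_DRIVE)
print(f"about {years:.2f} years to overwrite the archive")
```

The result scales linearly: doubling the number of drives halves the time, and doubling the archive doubles it.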
4) High Speed
Because tape is a sequential-access device, many people assume that it records slowly. That impression, however, reflects latency (the time it takes to reach the data), not throughput (the rate at which data is actually transferred). Tape throughput is very high. In addition, tape drives verify data as it is written: a read head checks the data immediately after the write head records it, so you know in real time whether the write succeeded.
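The difference between latency and throughput can be illustrated with a simple model: total time equals a fixed access delay plus size divided by transfer rate. The device profiles below are hypothetical values chosen only to show the shape of the trade-off, not measurements of any real drive. Whether tape actually out-streams a given disk depends on the hardware.

```python
def transfer_time(size_bytes, latency_s, throughput_bps):
    """Seconds to access and then stream `size_bytes` from a device."""
    return latency_s + size_bytes / throughput_bps


# Hypothetical device profiles, for illustration only.
TAPE = {"latency_s": 60.0, "throughput_bps": 300e6}   # slow to reach, fast to stream
DISK = {"latency_s": 0.01, "throughput_bps": 150e6}   # fast to reach, slower to stream

for size in (1e6, 1e9, 1e12):  # 1 MB, 1 GB, 1 TB
    t_tape = transfer_time(size, **TAPE)
    t_disk = transfer_time(size, **DISK)
    faster = "tape" if t_tape < t_disk else "disk"
    print(f"{size:.0e} bytes: tape {t_tape:.1f} s, disk {t_disk:.1f} s -> {faster}")
```

Under these assumed numbers, the fixed delay dominates for small transfers and the streaming rate dominates for large ones, which is why a high-latency device can still suit bulk recording.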
In this era of big data, the volume of digital data is said to reach as much as 44 ZB in 2020. How much of it will be preserved for the long term? A great deal of original data, especially unstructured data, is expected to be kept semi-permanently. Data as a Service (DaaS) has recently emerged, and its impact is not limited to public cloud services: many companies stand to benefit, and some will be able to innovate or build new business on their own data. How far will the volume of data swell? Why not draw on the wisdom of some of the world's brightest people, who have already succeeded in preserving hyperscale data cheaply, safely, and for the long term?