Managing And Securing Unstructured Data

With the massive growth of the Internet, there has been an enormous increase in the amount of information generated and shared on social networking sites and across various industries. This has led to storage and access issues with regard to unstructured data. Understanding what unstructured data is comes first, before trying to handle it. In simple terms, unstructured data is data that cannot be stored in the form of rows and columns. It can be anything, including email files, text documents, presentations, and image and video files. Studies carried out by IDC and EMC forecast that data will grow to 40 zettabytes (1 ZB = 1 billion TB). As of now, more than 80% of all stored data in organizations is unstructured and…
Also, backup and restore will take less time. The major disadvantage is that storing unstructured data in the database significantly increases the size of databases, which means more time for backup and restore of these databases and can cause performance issues with I/O subsystems. The big disadvantage of this approach is that we have to create and maintain manual links between the database and external file system files, and these links can go out of sync. Also, since the data is stored outside the system, the backups are not consistent and the unstructured data is not part of the transaction.
Hybrid Approach: To overcome the disadvantages of the methods discussed above, another method was introduced, the Hybrid Approach, in which the database engine supports a new data type, FILESTREAM. FILESTREAM combines the benefit of accessing BLOBs directly from the NTFS file system with the referential integrity and ease of access offered by the traditional relational database engine. In SQL Server, BLOBs can be standard varbinary(max) data that stores the data in tables, or FILESTREAM varbinary(max) objects that store the data in the file system. The major advantage of this approach is that BLOBs are covered by the database's transactional consistency. Research with different datasets clearly showed that the hybrid approach with the FILESTREAM data type is faster
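The contrast between storing the bytes inside the database and keeping only a manual link to an external file can be sketched in a few lines. This is a minimal illustration, not SQL Server's FILESTREAM: it assumes SQLite (through Python's standard `sqlite3` module) as a stand-in database, and the table names `docs_blob` and `docs_link` and all function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def store_in_database(conn, name, payload):
    """Approach 1: keep the bytes inside the database as a BLOB."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO docs_blob (name, content) VALUES (?, ?)",
            (name, payload),
        )


def store_in_file_system(conn, folder, name, payload):
    """Approach 2: write the bytes to disk and keep only the path in the database."""
    path = os.path.join(folder, name)
    with open(path, "wb") as handle:
        handle.write(payload)  # this write is NOT part of any database transaction
    with conn:
        conn.execute(
            "INSERT INTO docs_link (name, path) VALUES (?, ?)",
            (name, path),
        )
    return path


def find_broken_links(conn):
    """Return the names whose recorded path no longer exists on disk."""
    rows = conn.execute("SELECT name, path FROM docs_link ORDER BY name").fetchall()
    return [name for name, path in rows if not os.path.exists(path)]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs_blob (name TEXT PRIMARY KEY, content BLOB)")
    conn.execute("CREATE TABLE docs_link (name TEXT PRIMARY KEY, path TEXT)")
    with tempfile.TemporaryDirectory() as folder:
        store_in_database(conn, "report.txt", b"quarterly numbers")
        path = store_in_file_system(conn, folder, "photo.jpg", b"fake image bytes")
        before = find_broken_links(conn)
        os.remove(path)  # someone deletes the file behind the database's back
        after = find_broken_links(conn)
    blob = conn.execute(
        "SELECT content FROM docs_blob WHERE name = ?", ("report.txt",)
    ).fetchone()[0]
    conn.close()
    return before, after, blob
```

Calling `demo()` returns the broken links before and after the external file is deleted, plus the BLOB read back from the table. The row in `docs_link` survives the deletion of the file it points to, which is the out-of-sync risk described above, while the BLOB in `docs_blob` remains readable because it lives and is committed with the rest of the database.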