Assistant Professor Shantanu Sharma and Ph.D. student Komal Kumari in the Ying Wu College of Computing’s Department of Computer Science have devised a solution to secure document systems beyond standard encryption. The research has been published at the VLDB (Very Large Data Bases) Conference, a premier forum for data management, scalable data science, and database researchers.
The cloud has revolutionized the ability to store large amounts of photos, medical records, business materials, and myriad other types of documents on which the world depends daily. It is, however, also common knowledge that the cloud is not fully secure, and although encryption provides a level of security against hackers outside an organization, it cannot establish access control permissions inside the system.
The researchers’ paper on Doc★ describes an approach that replaces access permissions based on file IDs (a unique number for every file) with permissions tied to keywords. This not only enhances security but also makes access control easier to manage at scale: permissions for thousands of files can be changed by updating a single rule, using meaningful keyword tags such as “urgent,” “finance,” or “confidential.”
Kumari explained: “Think of a hospital. Nurses need patient charts, doctors need lab results, and billing staff need invoices. With everyone having different permissions, current encryption methods alone cannot provide strong access control.”
She further pointed out that permissions were also previously set one file at a time. “This sounds fine in some circumstances but imagine setting permissions for visiting doctors for large numbers of files by hand, and it becomes a nightmare,” she added.
Establishing keyword-based access permissions raises a question: how can the keywords be used without leaking any information? Doc★ uses a mathematical technique called Shamir’s secret sharing.
Sharma described it as “…cutting a puzzle into pieces and storing each piece on different non-colluding clouds, such as AWS, Azure, GCP. No single cloud has the full picture. Even if hackers break into one, they get nonsense.”
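To make the puzzle analogy concrete, here is a minimal sketch of textbook Shamir secret sharing in Python. It is not the Doc★ implementation: the prime modulus, the choice of three shares with a threshold of two, and the function names split_secret and reconstruct are all assumptions made for illustration.

```python
import secrets

# Field modulus: the Mersenne prime 2^61 - 1 (an assumption for this sketch).
PRIME = 2**61 - 1


def split_secret(secret, n_shares, threshold):
    """Split `secret` into `n_shares` points on a random polynomial.

    The polynomial has degree `threshold - 1` and its constant term is the
    secret, so any `threshold` shares recover it and fewer reveal nothing.
    """
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical usage: one share per cloud provider, any two can reconstruct.
keyword = int.from_bytes(b"finance", "big")
shares = split_secret(keyword, n_shares=3, threshold=2)
recovered = reconstruct(shares[:2])
print(recovered.to_bytes(7, "big"))
```

In this sketch each of the three shares would go to a different cloud. A single share is just a random-looking number, which matches the point that a hacker who breaks into one cloud gets nonsense.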
Doc★ runs in three phases:
1. Check whether the user is allowed to search for a keyword.
2. Find the file IDs associated with that keyword.
3. Retrieve only those files that do not contain a keyword to which the user is denied access.
All of this happens without servers ever knowing your keyword, the files, or your access rights.
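The logic of the three phases can be sketched in plain Python. This is only an illustration of the access-control flow on unprotected data: it hides nothing from the server, whereas Doc★ is described as running these steps without servers learning the keyword, the files, or the access rights. The roles, keywords, file IDs, and the names Policy, POLICIES, FILE_KEYWORDS, and search are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    allowed: set  # keywords this role may search for
    denied: set   # keywords whose files this role must never receive


# Hypothetical hospital policies and keyword tags per file ID.
POLICIES = {
    "nurse": Policy(allowed={"chart", "urgent"}, denied={"finance"}),
    "billing": Policy(allowed={"invoice", "finance"}, denied={"chart"}),
}
FILE_KEYWORDS = {
    1: {"chart", "urgent"},
    2: {"urgent", "finance"},
    3: {"invoice", "finance"},
}


def search(role, keyword):
    policy = POLICIES[role]

    # Phase 1: is this role allowed to search for the keyword at all?
    if keyword not in policy.allowed:
        raise PermissionError(f"{role} may not search for {keyword!r}")

    # Phase 2: find the file IDs tagged with the keyword.
    candidates = [fid for fid, kws in FILE_KEYWORDS.items() if keyword in kws]

    # Phase 3: drop any file that also carries a denied keyword.
    return sorted(
        fid for fid in candidates if not (FILE_KEYWORDS[fid] & policy.denied)
    )


print(search("nurse", "urgent"))
```

Here a nurse searching for "urgent" matches files 1 and 2 in phase 2, but file 2 is also tagged "finance", which the nurse is denied, so phase 3 filters it out. Changing one entry in POLICIES changes access to every file carrying that keyword, which is the scaling benefit of keyword-based rules over per-file rules.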
According to the researchers, by moving from clunky file-by-file rules to keyword-based policies, Doc★ makes the cloud both easier to use and harder to break.