Data Backups and Cloud Computing
Data backups and cloud computing can be treated as two separate subjects in one context and can be considered as complementary in another context. Data backups have been common since the inception of computers. Cloud computing, on the other hand, is a relatively recent phenomenon. The cloud infrastructure helps with effective data backups and is a low-cost disaster recovery option. Data backups help in ensuring restoration of data in case of data loss, data corruption, and data integrity issues.
As we have seen in previous chapters, “Availability” is one of the important aspects of information security. Data backups are the first line of defense against system crashes, data corruption, exploits that compromise data integrity, and accidental loss of data. The need for backups stems from the fact that the disks on which data is stored are prone to failure and can become a single point of failure. Backups allow operations to continue by restoring the data for the corrupted or crashed part of the system, and so assure continued availability, although the system remains unavailable for the time the restoration takes. Although the reliability of hardware and software has increased over the last few years, there is still a risk that the hardware, operating system, applications, or databases will crash, resulting in data loss or corruption. Hence, data backups are a must even today. The process of backing up data has progressed from manual backups to automated backups using mechanisms such as tape libraries, from offline backups to online high-speed mirroring, and from in-house storage to off-site storage at third-party data custodians. However, one thing has remained constant: even today, you cannot do away with backups.
On the other hand, cloud computing has opened new avenues for low-cost use of applications, application development and deployment, and infrastructure acquisition. The cost effectiveness is evident not only in the reduced upfront investment, but also in the lighter financial burden of the pay-as-you-use model. You can also add or reduce computing power and storage as the needs of your business change. However, cloud computing has also heightened concerns about security and privacy. Even though most privacy and security issues apply to other platforms and applications too, some specific, additional issues arise with the use of a cloud.
Need for Data Backups
As discussed above, even in today’s world, the value of backups cannot be discounted; they are still the first line of defense against data loss, data corruption, and system crashes. Today, most documents and records are stored on computers, so if something happens to them, they could be lost forever unless the correct backup system is in place. Consider the following situations. A property has been registered in your name and the registration process was handled entirely through a computer. Now you find that the computer through which the registration was carried out, or the server on which the registration details were held, has crashed and all the details have been wiped out.
Similarly, imagine that you concluded a big deal with another corporation electronically and the related computers crashed, losing the documents. Or imagine that a bank’s server and the database on it crashed, causing the complete loss of all of the bank’s online transactions for the day.
In all these situations, unless the parties have other means of demonstrating or verifying that those transactions took place (such as e-mails or written documentation), the parties involved will be put in very difficult circumstances. Nothing can replace the original information or transaction; thus, backups are the best and primary means of ensuring that the data is restored quickly and effectively.
Not having backups is dangerous for the organization. Most transactions are initiated on the internet, pass through different network equipment, and culminate on the relevant servers. Individuals may not keep proof of such transactions. Suppose I order a $1000 computer online but do not save the order on my system. The amount is debited to my credit card, but the order is wiped out when the online store’s server crashes and the store has no backup of my transaction. The only way I can convince the store is through the debit on my credit card account. But because the store has lost the transaction from its server, my computer may not be delivered even after 30 days, as the store first has to verify that the transaction really took place and that it received the payment. A backup would have let the store restore the data and make the delivery as promised.
Backups protect the availability and integrity aspects of information security. If a database is completely corrupted, only the backups can make the corresponding systems available again. Similarly, if data integrity is compromised, only a recent backup allows the application to be brought back to its last known good state. As these discussions show, backups are an integral part of the information technology infrastructure and cannot be overlooked.
The various availability options like RAID (Redundant Array of Inexpensive Disks), Server Clustering, Electronic Vaulting, Remote Journaling, and Database Shadowing provide further alternative / complementary options to what backups can do and facilitate concepts like Online Mirroring and Hot Alternative Sites which are important constituents of Disaster Recovery and Business Continuity. These aspects have to be looked into, along with the backups, as an integral part of backup strategy. These are detailed as appropriate in subsequent sections.
Types of Backups
Backups can be categorized in different ways, as described in the following sections.
Category 1: Based on current data on the system and the data on the backups
One way of categorizing backups is by how they are taken and by the time delay between the current data on the system and the data on the backup. On the basis of this time delay, backups can be categorized as Online Backups, Near-line Backups, and Offline Backups.
Online backups are taken in real time, and provide for high redundancy and fault tolerance. RAID Level 1, RAID Level 15, and RAID Level 51 are examples of online backups that provide high redundancy. Server Mirroring, Remote Journaling, and High-Speed Online Mirroring are other ways of enabling online backups and provide for high redundancy which does away with the single point of failure. These online backup mechanisms are required in highly critical systems where you cannot afford to lose even a small fraction of data, such as banking, data centers supporting various organizations, or organizations where different portions of the same work are being carried out by different centers.
Near-line backups are taken in near real time rather than in real time, so there is a gap between the current state of the system data and the state of the data on the backup device. Electronic vaulting can enable this if the batch transfer of data to another system or to an off-site alternative system is carried out frequently. Electronic vaulting is explained in a subsequent section of this chapter. Often, offline backups are considered to be near-line backups.
Offline backups are the most common form of backup used. These backups are taken on tapes or external hard disks or other media as relevant. Individual system’s data can be backed up to external hard disks. Servers and other systems with huge data stored on them may be backed up on tapes using automated tape libraries on which the backups are scheduled. Sometimes backups taken when the systems are offline (not being used) are considered as offline backups, such as a backup taken when no transaction was happening on the database or the database was in offline mode.
Category 2: Based on what goes into the backup
Another way of categorizing backups is on the basis of what goes into each backup. This category includes full backups, incremental backups, and differential backups.
As the name suggests, full backups back up the entire system, generally onto external media such as tapes or external hard disks. They take a long time to complete, depending upon the quantity of data to be backed up. Normally, they are taken during weekends, when there are few or no transactions, so that they do not adversely impact the performance of the system being backed up. This also ensures that most of the files are backed up. Most organizations take full backups weekly, but some businesses also conduct a full backup monthly or annually. Weekly backups are usually overwritten the following month. Once the backup is taken, the archive file attribute of every backed-up file is cleared (reset), so that the system knows these files have been backed up; the attribute is set again when a file is next modified.
Incremental backups complement the full backup: only the files that have changed since the most recent backup, whether full or incremental, are backed up. For example, suppose that on the Monday following a weekend full backup only file Z has changed. The incremental backup on Monday night then backs up only file Z. If on Tuesday two more files, X and A, are changed, only these two files are backed up by Tuesday’s incremental backup. Then on Wednesday, if file B is changed, only that file is backed up. After each incremental backup, the archive file attribute of the backed-up files is reset. This process continues until the next full backup is taken.
Differential backups also complement the full backup, but each one backs up all the files that have changed since the last full backup. Taking the above example, the differential backup on Monday backs up only file Z, whereas on Tuesday it backs up files Z, X, and A (i.e., all the files changed on Monday and Tuesday). Accordingly, on Wednesday files Z, X, A, and B are all backed up. Here, the archive file attribute is not reset after the files are backed up, so that they are backed up again on the subsequent days. This process continues until the next full backup is taken.
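The difference between the two schemes can be illustrated with a short simulation. The following is a minimal sketch, not a real backup tool: the function name simulate_week and the in-memory set that stands in for the archive file attribute are assumptions made for illustration, and the file names are those of the example above.

```python
def simulate_week(changes_by_day, mode):
    """Return {day: sorted list of files backed up} for one backup cycle.

    changes_by_day: ordered mapping of day -> files modified on that day.
    mode: "incremental" or "differential".
    The cycle is assumed to start right after a full backup, when no file
    has its archive attribute set.
    """
    if mode not in ("incremental", "differential"):
        raise ValueError("mode must be 'incremental' or 'differential'")
    archive_flagged = set()  # files whose archive attribute is currently set
    backed_up = {}
    for day, changed in changes_by_day.items():
        # Modifying a file sets its archive attribute.
        archive_flagged.update(changed)
        # The nightly backup copies every file that carries the attribute.
        backed_up[day] = sorted(archive_flagged)
        if mode == "incremental":
            # An incremental backup resets the attribute after copying.
            archive_flagged.clear()
        # A differential backup leaves the attribute set.
    return backed_up


week = {"Monday": ["Z"], "Tuesday": ["X", "A"], "Wednesday": ["B"]}
incremental = simulate_week(week, "incremental")
differential = simulate_week(week, "differential")

for day in week:
    print(day, "incremental:", incremental[day], "differential:", differential[day])
```

The incremental run copies only each day's changes, so a restore needs the full backup plus every incremental since. The differential run grows each day, but a restore needs only the full backup plus the latest differential.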
Category 3: Based on storage of backups
On-site Storage: Backups are stored / preserved on site. Such backups may be on tapes or external hard disks or another backup system where the backup files are dumped.
Off-site Storage: Backups are stored off site. These off-site storage backups may be at some other office / branch of the same organization or with specialized records and data custodians.
Category 4: Based on the extent of the automation of the backups
Highly automated backups: These are taken through online backup mechanisms like RAID Level 1, RAID Level 15, RAID Level 51, and Server Mirroring.
Automated backups: These are scheduled and automated through mechanisms like tape libraries.
Manual backups: These are taken manually as per a calendar of backups. For example, the month end, year end, and weekend calendar will specify the full backups, and daily calendars will specify the daily incremental or differential backups as per the backup strategy of the organization.
RAID Levels
RAID Level 0: This level uses the technique known as Striping and stores data on one large virtual disk consisting of several physical disks. It spreads the data onto several disks. Even though this improves the performance, it does not create redundancy. If any of the constituent disks fails, the entire system fails.
RAID Level 1: This level uses the technique of online mirroring. As transactions happen, while the application writes the data onto a disk or a set of disks, it also mirrors the same data onto another disk or another set of disks. This provides for complete redundancy in the sense that if one disk fails, the system automatically switches onto the corresponding mirrored disk. This is a costly option but works well for critical systems that have zero or minimal tolerance for downtime.
RAID Level 2: This level is hardly used now.
RAID Levels 3 & 4: RAID Levels 3 and 4 are similar except that Level 3 is implemented at byte-level and Level 4 is implemented at block-level. While data is written to several disks constituting one large volume like in Level 0, the parity bit is written to a separate parity drive. This enables redundancy as the data can be reconstituted using the information from the parity drive. However, the risk here is that if the parity drive crashes the entire redundancy is lost. Further, as the parity information is written onto a single parity drive, performance is negatively impacted.
RAID Level 5: This level is similar to RAID Levels 3 and 4 except that the parity information is distributed across all the drives rather than written to a single parity drive. This technique is known as interleaved parity. It provides higher redundancy, removes the parity drive as a single point of failure, and gives higher performance. Crashed disks can be swapped while the system is still working (hot swapping), without the system having to be brought down.
RAID Level 6: This level is the same as RAID Level 5 except that it maintains two independent sets of parity information, so the array can survive the failure of two disks at the same time.
Various other RAID levels have been designed as necessary. Popular among them are RAID Level 10, which combines mirroring with striping of the data across multiple pairs of disks; RAID Level 15, which combines the mirroring technique of RAID Level 1 with the interleaved parity technique of RAID Level 5; and RAID Level 51, which ensures that all the disks, including the parity information, are mirrored.
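The way parity allows data to be reconstituted after a disk failure can be shown with a small sketch. It assumes the usual XOR parity calculation and represents each disk's block as a short byte string; the helper name xor_blocks and the sample data are illustrative only, and a real RAID controller works on fixed-size stripes in hardware or in the operating system.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread across three data disks (sample data, equal lengths).
data_blocks = [b"ACCT", b"1001", b"PAID"]

# The parity block is the XOR of all the data blocks in the stripe.
parity = xor_blocks(data_blocks)

# Simulate the failure of the second disk: only the surviving data blocks
# and the parity block are available.
survivors = [data_blocks[0], data_blocks[2], parity]

# XOR-ing the survivors yields the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

Because XOR is its own inverse, any single missing block in the stripe can be rebuilt this way, whether the lost block held data or parity. Losing two blocks of the same stripe defeats a single parity set, which is the case RAID Level 6 addresses.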
Other Important Fault Tolerance Mechanisms
Server Clustering: A group of servers is clustered to provide high performance as well as redundancy. While in good health, all the servers complement each other to provide better performance. When one of the servers crashes, the others continue with the work, so the service appears the same as usual to the users, albeit with reduced performance, which most of the time the end users may not perceive. This system balances the load as well as provides for redundancy in case of failures of individual servers. Server Clustering is illustrated in Figure 13-2.
Electronic Vaulting: Here the data is transferred to an alternative system (usually at an alternative site) using the batch mode.
Remote Journaling: This is similar to Electronic Vaulting except that the data is not transferred in batch mode but online, as and when the transactions happen, so that the alternative site can take over at any point in time if the main site or main server fails, whether because of a natural disaster or for other reasons. The sites are usually connected through high-speed links to keep them always in sync.
Server Mirroring: Here another server known as the secondary server is deployed. This will help in taking over the running of the system if the primary server fails. The data is usually mirrored between these two systems (i.e., the primary server and the secondary server) using high-speed links. It is possible to roll over to the secondary server without the users being aware of it or the users may be redirected to the secondary servers when the primary server fails. This depends upon the strategy of the organization. Server Mirroring is illustrated in Figure 13-3.
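The practical difference between batch transfer (Electronic Vaulting) and per-transaction transfer (Remote Journaling) is how much data the alternative site is missing when the primary fails. The sketch below is a simplified model under stated assumptions: the class names are hypothetical, the alternative site is just a Python list, and the vault ships a batch after a fixed number of transactions, whereas real products usually ship on a time schedule.

```python
class ElectronicVault:
    """Ships accumulated transactions to the alternative site in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []  # recorded locally, not yet transferred
        self.remote = []   # what the alternative site holds

    def record(self, txn):
        self.pending.append(txn)
        if len(self.pending) >= self.batch_size:
            # Batch transfer: everything pending is shipped at once.
            self.remote.extend(self.pending)
            self.pending.clear()


class RemoteJournal:
    """Ships every transaction to the alternative site as it happens."""

    def __init__(self):
        self.remote = []

    def record(self, txn):
        self.remote.append(txn)


transactions = [f"txn-{n}" for n in range(1, 8)]
vault = ElectronicVault(batch_size=3)
journal = RemoteJournal()
for txn in transactions:
    vault.record(txn)
    journal.record(txn)

# Suppose the primary site fails now: what is missing at the alternative site?
lost_with_vaulting = [t for t in transactions if t not in vault.remote]
lost_with_journaling = [t for t in transactions if t not in journal.remote]
print("lost with vaulting:", lost_with_vaulting)
print("lost with journaling:", lost_with_journaling)
```

With vaulting, the exposure is whatever has accumulated since the last batch; with journaling, the alternative site is current up to the last transaction transferred.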
Role of Storage Area Networks (SAN) in providing Backups and Disaster Recovery
Storage Area Networks (SANs) can complement traditional backups as they provide for both backup and recovery. A SAN is a high-speed, special-purpose network that connects servers to shared storage devices. SANs provide for disk mirroring and enable the sharing of data between different servers, thus enabling effective disaster recovery when one of the servers fails, or by replicating the stored data. Further, the data can be easily migrated from the SANs to other systems. SANs may be relatively costly, but offer an effective and efficient mode of backup and restoration, as well as disaster recovery capability.
Cloud Infrastructure in Backup Strategy
Cloud infrastructure provides an easy and low-cost backup option to organizations, as the cost of public cloud infrastructure is low and continues to decrease as more players enter the field. Organizations need not invest heavily in backup hardware and software, including tape libraries, tape media, tape storage space, off-site tape storage custodian services, and backup personnel. They now have the option of backing up or replicating the data onto the cloud infrastructure. However, to be successful, organizations require substantial bandwidth and network speed to connect to the cloud. Further, the replication and backup have to be verified periodically to ensure that the data is being backed up correctly. Any issues identified have to be addressed promptly. For example, data may not be replicated or written to the cloud when the network connection fails, in which case you need to manually sync the data on the cloud with the current data on the systems. Finally, you need to ensure that the backups on the cloud are protected from malicious attacks.
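One way to check whether the cloud copy has fallen behind, for example after a network failure, is to compare a manifest of content digests for the local files against a manifest for the cloud copies. The following is a minimal sketch under stated assumptions: the file contents are held in dictionaries purely for illustration, the function names are hypothetical, and how the cloud-side manifest is obtained depends on the provider and is not shown.

```python
import hashlib


def build_manifest(files):
    """Map each file name to the SHA-256 digest of its content."""
    return {name: hashlib.sha256(content).hexdigest()
            for name, content in files.items()}


def files_to_resync(local_manifest, cloud_manifest):
    """Return names that are missing from the cloud or whose digest differs."""
    return sorted(name for name, digest in local_manifest.items()
                  if cloud_manifest.get(name) != digest)


# Illustrative data: the cloud copy missed an update to ledger.db and never
# received hr.xlsx because the connection dropped.
local_files = {"ledger.db": b"version 2", "orders.csv": b"version 1",
               "hr.xlsx": b"version 1"}
cloud_files = {"ledger.db": b"version 1", "orders.csv": b"version 1"}

stale = files_to_resync(build_manifest(local_files), build_manifest(cloud_files))
print("needs manual sync:", stale)
```

Running such a comparison periodically gives the system administrators a concrete list of files to sync manually instead of relying on the replication job's own success message.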
Most database systems have provisions for replication of data, physical database backups, and logical database backups. They also provide for cold (offline) and hot (online) backups. Each type of backup has advantages and disadvantages. The organization has to study the backup methods available for each database management system it has deployed and enable only those methods that suit the organization and its backup objectives. The criteria for such a decision include the size of the database, the transaction load, the criticality of the function performed by the database system, and whether the database has any idle period. Similarly, for data recovery, the methods suggested by the database vendor have to be used.
Each organization has to decide on its backup strategy based on the systems it has, the criticality of its business, its functions, and its data, and its tolerance for data loss.
Some organizations, like banks, financial institutions, e-commerce sites, and online reservation / booking systems, may not be able to afford the loss of even a single transaction or a portion of the data of their critical systems. From the availability perspective, such a loss can significantly harm the business of the organization; from the integrity perspective, it can damage the reputation of the organization or lead to serious confusion. Hence, high availability may be a requirement in those cases.
An organization with no tolerance for data loss or loss of integrity of data must implement such strategies as online mirroring using the relevant RAID technologies, Server Mirroring, or Database Shadowing. Mirroring or Remote Journaling to an alternative hot site can enable a highly critical business to resume business from an alternative site without any business lag.
Backups also cost money. The cost components are backup hardware, backup software, backup media, the personnel involved in taking the backups, backup storage (including the physical and/or logical space used), verification of backup restoration, and off-site storage. Hence, the backup strategy has to be worked out by comparing the benefits of backups with their costs. This does not mean that you can do without backups, but the investment in backups has to be commensurate with the risks the organization is trying to avoid.
Typically, organizations conduct weekly full backups using tapes. These are normally completed on the weekends, and the tapes are normally recycled during the corresponding week of the following month.
They also conduct monthly and/or yearly backups, which are preserved perpetually.
In addition, they conduct differential or incremental backups on weekdays. The tapes used for the incremental or differential backups are recycled in the subsequent weeks.
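A backup calendar of this kind can be expressed as a small rule that maps a date to the backup due on it. The sketch below is one possible encoding, under assumptions the text does not fix: the weekly full backup is placed on Saturday, the monthly and yearly backups on the last day of the month and of the year, weekday backups are incremental, and nothing is scheduled on Sunday. The function name backup_type is illustrative.

```python
import calendar
from datetime import date


def backup_type(day):
    """Classify the backup due on a given date under the assumed rotation."""
    last_day_of_month = calendar.monthrange(day.year, day.month)[1]
    if day.day == last_day_of_month:
        # Month-end and year-end backups are the ones preserved perpetually.
        return "yearly full" if day.month == 12 else "monthly full"
    if day.weekday() == 5:  # Monday is 0, so 5 is Saturday
        return "weekly full"
    if day.weekday() == 6:  # Sunday
        return "none"
    return "daily incremental"


for d in (date(2024, 1, 3), date(2024, 1, 6), date(2024, 1, 31), date(2024, 12, 31)):
    print(d.isoformat(), backup_type(d))
```

Encoding the calendar as code, or as an equivalent scheduler configuration, makes it easier to audit that no backup in the strategy is missed.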
Typically, backup software (e.g., Acronis True Image, Genie Backup Manager, Symantec Backup Exec – Recovery Manager) is used to conduct backups reliably. Many good backup managers are also available for free. An important part of the backup process is verification, which checks that the backup has completed successfully. The backup software should also notify the system administrators of any errors so that they can be handled effectively.
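The idea behind verification can be sketched with the Python standard library: copy a file, then confirm that the digest of the copy matches the digest of the original. This is an illustration only and does not describe how any of the products named above verify their backups; the function names and the temporary files are assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, destination):
    """Copy source to destination and report whether the copy is identical."""
    shutil.copy2(source, destination)
    return sha256_of(source) == sha256_of(destination)


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "ledger.dat"
    copy = Path(workdir) / "ledger.dat.bak"
    source.write_bytes(b"transaction records\n" * 1000)

    verified = backup_and_verify(source, copy)

    # Simulate a defective medium that silently stored nothing.
    copy.write_bytes(b"")
    still_valid = sha256_of(source) == sha256_of(copy)

print("verified after copy:", verified)
print("valid after media fault:", still_valid)
```

A check of this kind reads the data back from the backup medium, so it detects the case where the backup job reports success but the medium holds nothing usable.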
Backup reliability is further ensured through periodic checks of the backup media for continued usability, as these media are also prone to faults due to ageing, repeated use, or the environmental conditions to which they are exposed. As backup media are written on again and again over time, every organization should understand how reliable its media are, and for how many write cycles or for how long they can be used effectively. This calls for the timely retirement of aged or faulty media and a program of planned restorations to test the backups and the continued usability of the media.
Although backups are primarily of data that changes as the systems or applications are used, they may also be of system software or application software. Simply copying files may not produce an effective backup. Different types of data, such as databases or configuration management system data, may have to be backed up using the method specified by the corresponding database, utility, tool, or vendor. Additionally, some of these backups may have to include the logs internal to these systems. These aspects have to be kept in mind to ensure that the backups are useful for restoration when required.
Restoration from backups may not be effective if the media is not readable when required. It may not be effective if all the files were not backed up during the backup process. This may be due to a faulty backup or media fault. Hence, every organization needs to have a planned restoration strategy to understand the effectiveness of its backups.
The above strategy should be planned and may be carried out on a quarterly basis by identifying which types of backups and which media will be restored during the planned exercise. This provides checks on the continued suitability and effectiveness of the media used, the continued effectiveness of the backup method, and the completeness and correctness of the backups. The strategy should ensure that all the backups work when required for restoration or recovery.
Restorations are also carried out on user request. A user may accidentally delete a file; many of us have made this mistake. Sometimes the recycle bin comes to our rescue, but we may realize only after the recycle bin has been emptied that we deleted the file by accident. Sometimes files are overwritten by mistake. It is also possible for a file to be corrupted because of system issues such as abnormal behavior of the system, a sudden shutdown, or a virus. The only way to restore such files is to ask the system administrators to restore them from the appropriate backups. This may require the system administrators to go back through the backup history and locate the appropriate backup file.
If a file has been accidentally deleted, overwritten, or corrupted during the week, it can usually be restored from last weekend’s full backup or from the differential / incremental backups of the current week, depending upon when the file was last modified. If the file was last modified before the previous weekend, the previous weekend’s full backup should help. If it was modified during this week, the applicable differential / incremental backup should help. If a file created last month cannot be traced now and was possibly deleted during an abrupt system corruption in the first week of this month, the system administrators may have to look for the file in last month’s month-end full backup.
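Locating the right backup for a single file amounts to searching the backup catalog from the newest backup to the oldest until one containing the file is found. The sketch below assumes a hypothetical catalog structure, a list of (label, set of file names) pairs ordered from oldest to newest; real backup software keeps its own catalog and exposes it through its own interface.

```python
def locate_latest_copy(file_name, catalog):
    """Return the label of the newest backup containing file_name, or None.

    catalog: list of (label, set_of_file_names), ordered oldest to newest.
    """
    for label, contents in reversed(catalog):
        if file_name in contents:
            return label
    return None


# Illustrative catalog: a month-end full backup, last weekend's full backup,
# and two incremental backups from the current week.
catalog = [
    ("month-end full", {"budget.xls", "plan.doc", "old_report.doc"}),
    ("weekend full", {"budget.xls", "plan.doc"}),
    ("Monday incremental", {"plan.doc"}),
    ("Tuesday incremental", {"budget.xls"}),
]

for name in ("plan.doc", "budget.xls", "old_report.doc", "missing.txt"):
    print(name, "->", locate_latest_copy(name, catalog))
```

A file that was deleted before the weekend full backup, such as old_report.doc here, is found only in the month-end backup, which matches the last case described above.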
Online backups also have to be audited periodically for effectiveness, including completeness and correctness. In my experience, backup tools have confirmed that backups completed successfully when no files could actually be found on the tapes. This is often due to a defective tape.
Important Security Considerations
Backups have to be encrypted (at least for corporations) so that they cannot be tampered with or misused by internal employees, by external off-site data custodians, or by others during transmission between the on-site and off-site locations.
Backups must never be missed for any reason.
The completeness and correctness of backups have to be verified invariably and reliably by the backup process so that the backups are usable when required.
Backups have to be protected against fire threats by preserving them in fire proof cabinets when they are on-site. Similar care has to be taken by the off-site data custodians.
Backups have to be protected from adverse environmental impacts, such as leaked water, strong electromagnetic field, fungus, and dust.
Backups have to be protected through appropriate handling from issues such as static electricity (in case of backups in external hard disks, etc.), vibrations / shocks, and dropping.
Backups have to be securely transferred from on-site to off-site and vice versa.
Accidental overwriting has to be avoided through appropriate labelling.
Backup media have to be accounted for so that none of the media are missed; backups are stored and arranged appropriately so that they can be traced easily.
Some Inherent Issues with Backups and Restoration
Most of the backups on tapes and other media (other than through online mirroring and high speed storage area networks) are very slow and take significant time to complete the backup. Similarly, these types of backups on tape and similar media also take significant time for restoration. Your disaster recovery efforts have to factor in this restoration time.
Most of the backup strategies, other than online instant mirroring and high-speed storage area networks, may not be able to restore the data completely in case of data corruption or data loss as there is always time lag between the last backup and the server crash leading to data corruption.
In case of server crashes, including operating system and / or application crashes, it is not enough to have a data backup; you must also have a clean (non-infected and genuine) copy of the operating system and / or of the application software. You also need to ensure that, before the data is restored, the system is brought up to date by applying the relevant updates / patches to the operating system and / or the application.
While restoring, ensure that the appropriate data backups are selected. Otherwise, time is wasted: after a restoration (which takes significant time), you may find that you have restored the data from an old backup and not from the one you were supposed to use. It is therefore very important to label the media appropriately, internally and externally, so that they can be identified correctly.
Best Practices Related to Backups and Restoration
Have a written Backup Policy and Strategy which is understood by all, particularly by the system administrators who are responsible for the backups.
Make it clear to the users what information they need to pass on to the system administrators to ensure that the backups do not miss critical systems or files. Some systems may be deployed at the departmental level (like HR or Finance), and some directory or file backups are carried out only on request.
Frame the Backup Policy and Strategy after considering the criticality of the business data and systems and after weighing the benefits and the costs of backups. This should also take into consideration the tolerance of the business to system downtime and the adverse impact on the business of systems not being available.
Once the Backup Policy and Strategy is decided, ensure that it is executed without fail. If the systems crash and you lose the data, and you then realize that no backup has been taken for the past month, the backups taken earlier will be of little use and you may not have much recourse.
Have a periodic restoration plan (like a quarterly restoration plan) that specifies which backups have to be restored and which media have to be tested. Again, ensure that these plans are invariably executed and that their outcome is recorded. If any medium is found to be unreadable, it has to be replaced with a good one. Usually, either the entire backup or the files selected by the data owners have to be restored to ensure that the backup is effective and restoration is possible.
Backup policies and strategies are to be revisited and modified as appropriate whenever new systems or applications are added to the organizational IT infrastructure.
The Backup Management Software has to be configured and used appropriately. The errors thrown up by the backup management software have to be handled appropriately.
Periodically, the backup media have to be reconciled to ensure that no backup media are missing. If any of the backups are missing, it may lead to an information security breach, as the data can be misused by others if it is not encrypted properly.
The transfer of the data backups between the on- and off-site storage facilities should be carried out securely. Usually locked boxes with the key available at the other centers are used for such transfers.
Media have to be retired and replaced periodically to avoid the risks posed by expiration.
All media have to be labelled appropriately to show what each contains and when it was last written on, with a media number to enable traceability at on-site or off-site facilities.
Critical backups are to be held in an appropriate environment that is not exposed to high heat, high humidity, or high magnetic field as relevant to the media. The critical backups are to be held in fire-proof cabinets to protect them from fire disasters.
Permanently archive, on reliable external media, files that are not currently required or will not be required in the future but that have to be kept under the organizational data retention policies for legal and audit purposes. Hold these archives securely. This eliminates the need to store such files on the current systems and to back them up regularly.
Carry out regular periodic audits of the backup and restoration practices and identify the weaknesses in the current backup policies. Ensure that the policies are modified appropriately to address the issues found.
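The reconciliation, recycling, and labelling practices listed above can be sketched as a simple media register. This is only an illustrative sketch: the record fields, media numbers, and dates are hypothetical, and a real backup management product would keep this catalogue itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MediaRecord:
    media_number: str   # unique number printed on the media label
    contents: str       # what the backup contains
    last_written: date  # when the media was last written
    expires: date       # date after which the media should be recycled
    location: str       # "on-site" or "off-site"


def reconcile(register, scanned_numbers):
    """Return media numbers listed in the register but not found in the physical count."""
    found = set(scanned_numbers)
    return sorted(r.media_number for r in register if r.media_number not in found)


def due_for_recycling(register, today):
    """Return media numbers whose expiry date has been reached as of `today`."""
    return sorted(r.media_number for r in register if r.expires <= today)


# Hypothetical register entries, for illustration only.
register = [
    MediaRecord("T-0001", "finance DB full backup", date(2024, 1, 5), date(2026, 1, 5), "off-site"),
    MediaRecord("T-0002", "mail server weekly backup", date(2024, 3, 1), date(2024, 9, 1), "on-site"),
    MediaRecord("T-0003", "file server full backup", date(2024, 3, 8), date(2026, 3, 8), "off-site"),
]

missing = reconcile(register, scanned_numbers=["T-0001", "T-0002"])
expired = due_for_recycling(register, today=date(2024, 10, 1))
print("Missing media:", missing)
print("Due for recycling:", expired)
```

Any media number reported as missing would then be investigated as a potential security incident, as described in the practice on reconciliation.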
Introduction to Cloud Computing
Cloud computing has been a hotly debated topic of late, drawing divergent views. Many people strongly support it because of its benefits. At the same time, many oppose it or do not accept it because of apprehensions related to security and privacy. Cloud technology itself is evolving and many of its aspects are as yet unclear. Visibility into much of the cloud infrastructure, and significant transparency about it, is yet to be achieved. All these are a prime concern for organizations that want to move to the cloud but are holding back.
Of course, the world is excited about the prospects of cloud computing. Individuals enjoy the benefits of cloud computing because they are not very worried about security, or may not be aware of the security issues, presuming that the cloud provider, being a professional entity, will have taken care of all these aspects. However, corporations and legal entities must weigh the benefits against the risks cautiously, as they are accountable to their shareholders, customers, and partners. Despite the hype around cloud computing, it has not become as popular as expected, because many experts are highlighting security, privacy, and other concerns that need to be addressed effectively.
What is Cloud Computing?
In lay terms, cloud computing is the provision to use applications, platforms, and infrastructure from a third party without upfront investment but with the provision to pay as you use, while at the same time, providing for flexibility to increase or decrease the usage depending upon the organizational necessities or requirements. Definitely, the outright benefits are that you need not put up a huge upfront investment to create the infrastructure or purchase the tools for development. You can acquire the rights to use them without any initial investment, but pay for them based on the usage. You can also increase the usage by opting for additional infrastructure, additional tools for development, or an additional number of applications or facilities for adding more users to use the current applications.
The beauty of cloud computing is that the users can make use of the services from the cloud using their web browsers, thin clients, or even equipment like smartphones and tablets. They do not require any sophisticated tools or complicated environment or utilities to utilize the cloud services.
Cloud computing can definitely be a boon for small organizations carrying out non-critical business, where there is little sensitive information and few transactions. They may be okay with the occasional non-availability of connectivity, or occasional breaches of data, as the benefits outweigh the risks. Hence, security may not be of much concern for such smaller organizations. However, they also need to be cautious about storing private and personal information related to their customers, as they too have to follow the laws and regulations of the region and cannot shirk their responsibility in this regard. Further, organizations can choose to use cloud infrastructure only for those applications and data which are not sensitive.
However, for most corporations that compete for market share and hold large amounts of confidential, proprietary, or intellectual property information, cloud computing is unacceptable in its current form. The risks related to security and privacy, including confidentiality, integrity, authenticity, authorization, privacy, and availability, need to be weighed cautiously against the benefits. None of these issues can be taken lightly, in spite of the cost effectiveness or the flexibility accorded. Before moving to the cloud, such organizations need to evaluate:
the scope and benefits of the offering, and how it impacts their business, customers, and partners,
the terms and conditions of the offer, including the responsibilities the cloud provider is undertaking and the responsibilities the cloud provider is passing on to the cloud consumer,
the legal implications of the usage of cloud,
the technology employed by the cloud provider,
the transparency provided by the cloud provider,
how the cloud provider fulfils the demands of his consumer base,
how the cloud provider governs or manages the infrastructure,
how the changes are managed, and
how the transition from one cloud provider to the other is possible.
There are many more questions to be evaluated based on the needs of the organization.
Fundamentals of Cloud Computing
What is cloud computing? “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction”.1
Some of the words in the above definition are very interesting. Most of them are self-explanatory, except possibly the word “ubiquitous,” which means all-pervasive or universal. Just as real clouds can be found everywhere, the cloud infrastructure can be anywhere. While the words “ubiquitous,” “convenient,” “on-demand,” “configurable,” “rapidly provisioned or released,” and “minimal management effort or service provider interaction” highlight the benefits of the cloud, in our opinion the words to be watched carefully are “ubiquitous,” “shared pool,” “configurable resources,” and “minimal management effort or service provider interaction,” as these also indicate the need for higher security consideration.
Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
Cloud Service Models
As mentioned previously there are three different Cloud Service Models: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). The details of each of these and the interaction among them are explained in the following section.
Software as a Service (SaaS)
whether to go in for all functionalities planned upfront or to develop and release the product with features in phases;
which technology to go in for – whether cheaper or costlier, whether simple or complicated;
which operating system to be chosen as the platform for the application – whether open source like Linux or the popular one like Windows;
who should be the targeted users; what factors like the usability, scalability, flexibility, response times, maintainability, portability, etc. to be considered (of course, unfortunately, most of the times security as a factor is hardly considered);
whether to deploy the new resources to develop it or to hire experienced people to develop it.
Considering the infrastructure costs, development tool licensing costs, and human resources salaries, the total investment often tends to be substantial. Hence, the pricing of the product will also be substantial unless the product is targeted at the mass market. Added to that, the required marketing and sales efforts are huge and costly. After all this, it is still difficult to say whether the product will sell and, if it does, whether it will remain in demand, or whether by the time it reaches the market a competitor will have come up with a better product, perhaps at a better price.
The cloud-based Software as a Service (SaaS) has become a handy tool for application providers to rent out their software applications by hosting them on the cloud. Through this, they can make the software application readily available to organizations and users on demand. They can also attract more users as the price per user comes down substantially. Further, enabling additional instances of the same application, or installing the application additionally for a particular organization, is very easy for the vendor organization. This brings down the total cost of marketing, sales, installation, and maintenance of the application software. Again, it is very easy for such vendors to patch or update the software applications in case of any security issues or bugs. Further, because of the pay-as-you-use mode applied to the customers, the vendor organizations get a steady stream of revenue instead of the ups and downs in revenue realization that come with traditional sales efforts.
For the users, it is cost effective as the cost of renting is comparatively less than the cost of outright purchase. Further, they pay as they use and not upfront. They can also increase or decrease the number of users as their needs change, thus not locking up funds in unused software application licenses. More importantly, they need not have dedicated servers and other infrastructure to install and use such software applications in-house. In addition, most of the time, the consumer just has to provide the user with a web browser or a thin client interface. The end user platforms can be, for the most part, smartphones or tablets. This makes the entire proposition very attractive to the consumer organizations.
Software as a Service (SaaS) is defined as: “The capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings”.1
The advantage of SaaS is that the control on the appropriate configuration of applications lies with the cloud consumer. Other than this, most of the control, from the network to the servers to the operating systems, lies with the cloud provider. Some of the examples of SaaS are SalesForce.com, QuickBooks, Zoho Office Suite, justcloud.com, Dropbox, and many more similar applications.
Platform as a Service (PaaS)
PaaS is “The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment.”1
Many of the organizations in the IT industry are in the business of developing software frameworks or software application products, utilities, or tools. They require the infrastructure to develop these software applications, design and development tools, and testing tools to test their output. All these are costly and when added to the cost of human resources, add up to a significant cost to organizations. This has sometimes led companies to use pirated developmental tools which in turn have led to bugs in the outputs, security issues, and violation of laws.
PaaS allows organizations to rent the design, development, and testing platforms, software, and utilities at significantly less cost and often in a pay-per-use structure of payment. At any point of time, depending upon the need, they can move from one platform to another if they change course. This would have been possible only at additional cost in a traditional scenario where the organizations have to purchase the design and development frameworks or tools. Also, the number of resources using such tools can be increased or decreased on an as-needed basis. Another significant advantage is that some of the newer design and development tools, which the organizations could not have used because of their exorbitant cost, now become affordable to the consumer organization. These advantages are significant particularly for smaller organizations with limited funding and resources.
As mentioned in the definition, PaaS may also be used to deploy acquired applications in addition to the consumer created applications. Here, the application developer does not have much control over the underlying environment except for setting configurations on application hosting environment and full control over deployed applications. All other controls on the entire underlying infrastructure, including the storage, lie with the cloud provider. Sometimes, this may constrain the designers and developers to use the provided infrastructure and thus compromise even though they wanted a better underlying infrastructure.
Some of the examples of PaaS are Force.com platform, Windows Azure Platform, Google Apps, and Google App Engine.
Infrastructure as a Service (IaaS)
IaaS is defined as: “The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components (e.g., host firewalls).”1
Every organization usually invests heavily in data centers in which servers are installed and connected through networks, desktops and laptops are installed with the corresponding client software, and other infrastructure such as physical security and logical security, including access control, is implemented. The required support infrastructure, from IT managers and system administrators to support vendors and support systems like generators, UPS, and other backup mechanisms, is also deployed. All these come at a significant upfront capital expenditure and cost even more to maintain.
IaaS comes in handy here as it eliminates the investment in data centers, servers, high-end network equipment, storage equipment, and supporting utility infrastructure including physical security like access control. Again, the pay-as-you-use model comes in handy for the users. In the traditional scenario, once the investment is made by the organization, it is stuck with it even when the business decreases, whereas in the cloud infrastructure scenario it can requisition additional infrastructure on an as-needed basis and release unused infrastructure if the business decreases. This provides significant financial leverage to the organization. Further, you need not have sophisticated personnel to maintain your IT infrastructure and to support your systems physically and logically. You are also less likely to be hit by malicious infections, as you have a very lean IT infrastructure on your premises. Further, most of the infrastructure management headache is taken over by the cloud provider. On the other hand, if the cloud gets a malicious infection you are hit significantly, as such an infection can spread quickly to your internal organization unless the cloud provider has significant security controls built into the infrastructure.
With IaaS, organizations have very limited control over the underlying core infrastructure; in most cases they control only the operating systems, the applications deployed on them, and the related storage. Virtualization is what has made the provision of cloud infrastructure (IaaS) possible. A single physical server (i.e., hardware) is logically partitioned into many virtual servers, which are provisioned to various consumer organizations depending upon their requirements. A large number of servers are deployed horizontally to serve the large requirements of big customers while also catering to the smaller requirements of smaller organizations, thus building up a huge capacity at the cloud provider.
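The partitioning of physical servers among consumer organizations can be illustrated with a simple first-fit placement sketch. It is a deliberate simplification: the tenant names, memory sizes, and host capacity are hypothetical, and real providers also account for CPU, storage, network, and isolation requirements.

```python
def place_vms(host_capacity_gb, vm_requests):
    """First-fit placement: put each VM on the first host with enough free memory.

    host_capacity_gb: memory of each identical physical server (assumed value).
    vm_requests: list of (tenant, memory_gb) tuples.
    Returns a list of hosts, each a list of the (tenant, memory_gb) VMs placed on it.
    """
    hosts = []  # VMs placed on each physical host
    free = []   # remaining memory on each physical host
    for tenant, mem in vm_requests:
        if mem > host_capacity_gb:
            raise ValueError(f"{tenant} requests more than one host can offer")
        for i, remaining in enumerate(free):
            if mem <= remaining:
                hosts[i].append((tenant, mem))
                free[i] -= mem
                break
        else:
            # No existing host has room, so another physical server is used.
            hosts.append([(tenant, mem)])
            free.append(host_capacity_gb - mem)
    return hosts


# Hypothetical requests from five consumer organizations.
requests = [("org-a", 32), ("org-b", 16), ("org-c", 48), ("org-d", 8), ("org-e", 24)]
placement = place_vms(host_capacity_gb=64, vm_requests=requests)
for number, vms in enumerate(placement, start=1):
    print(f"host {number}: {vms}")
```

In this sketch, three unrelated organizations end up sharing the first physical host, which is exactly the sharing that makes the model economical and that also motivates the multi-tenancy concerns discussed later in the chapter.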
Basically, in this type of cloud provision, the control over Virtual Machine Monitor (VMM), underlying hardware (i.e., physical infrastructure), and network infrastructure lies with the cloud provider. However, depending upon the cloud deployment models chosen by the organization, controls that are available with the consumer organization may vary substantially.
Some of the examples of IaaS are: Amazon Elastic Compute Cloud, RackSpace Cloud, Eucalyptus, and GoGrid.
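The division of control stated in the three quoted definitions can be summarized as a small lookup table. The layer names and the "limited" label are our own shorthand for this sketch, not terms from the definitions, and the PaaS consumer's possible control over hosting-environment settings is noted only in a comment.

```python
PROVIDER = "provider"
CONSUMER = "consumer"
LIMITED = "consumer (limited)"

# Who controls each layer under each service model, following the quoted definitions.
CONTROL = {
    "SaaS": {
        "network": PROVIDER,
        "physical servers": PROVIDER,
        "operating systems": PROVIDER,
        "storage": PROVIDER,
        "applications": LIMITED,   # only user-specific configuration settings
    },
    "PaaS": {
        "network": PROVIDER,
        "physical servers": PROVIDER,
        "operating systems": PROVIDER,
        "storage": PROVIDER,
        "applications": CONSUMER,  # plus possibly hosting-environment settings
    },
    "IaaS": {
        "network": LIMITED,        # e.g., host firewalls
        "physical servers": PROVIDER,
        "operating systems": CONSUMER,
        "storage": CONSUMER,
        "applications": CONSUMER,
    },
}


def consumer_controlled(model):
    """Return the layers over which the consumer has at least some control."""
    return [layer for layer, owner in CONTROL[model].items() if owner != PROVIDER]


for model in ("SaaS", "PaaS", "IaaS"):
    print(model, "->", consumer_controlled(model))
```

Reading the table from SaaS to IaaS shows the consumer's span of control, and with it the consumer's share of the security responsibility, growing from the application settings down to the operating system.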
Important Benefits of Cloud Computing
Cloud computing offers plenty of important benefits. We discuss just a few of them here in the following section.2
Upfront Capital Expenditure (CAPEX) versus Pay as you use Operational Expenditure (OPEX)
In the cloud environment, the consumer organization is not required to make huge capital investments or payments upfront. The payment is made by the consumer organization depending on the services it needs and hence the payment is staggered and is made periodically, allowing for significant financial advantages to the organization.
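The difference between an upfront investment and staggered payments can be seen in a small cumulative cost calculation. All the figures below are hypothetical and serve only to show the shape of the comparison; actual costs depend on the provider, the workload, and the contract.

```python
def cumulative_costs(upfront, monthly_running, monthly_cloud_fee, months):
    """Return (in_house, cloud) cumulative spend after the given number of months."""
    in_house = upfront + monthly_running * months
    cloud = monthly_cloud_fee * months
    return in_house, cloud


# Hypothetical figures in an arbitrary currency unit.
UPFRONT = 120_000          # servers, licences, data center fit-out
MONTHLY_RUNNING = 2_000    # power, maintenance, support contracts
MONTHLY_CLOUD_FEE = 6_000  # pay-as-you-use charge at the current usage level

for months in (1, 6, 12, 24):
    in_house, cloud = cumulative_costs(UPFRONT, MONTHLY_RUNNING, MONTHLY_CLOUD_FEE, months)
    print(f"after {months:2d} months: in-house {in_house:>7}, cloud {cloud:>7}")
```

With these assumed numbers, the in-house option has already committed 122,000 after the first month, while the cloud option has spent only 6,000. The comparison over longer periods depends entirely on the figures chosen.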
Elasticity or Flexibility
The organizational business and transaction load increases or decreases over time. In the cloud environment, the consumer organization has the flexibility to increase or reduce the usage, demand additional infrastructure, storage, or application usage as their needs change. This flexibility eliminates the need for the organization to provide for the infrastructure to cater to the peak loads or predicted increased business. Further, changes can be made by the consumer in the cloud environment (often without the intervention of the cloud provider) whereas the additional infrastructure ramp up in the traditional way would take substantial lead time.
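The saving from not having to provision for peak load can be illustrated with a small calculation. The load figures and the per-server capacity are hypothetical, and the sketch assumes that capacity can be added and released month by month.

```python
import math


def servers_needed(load, capacity_per_server):
    """Smallest number of servers that can carry the given load (at least one)."""
    return max(1, math.ceil(load / capacity_per_server))


# Hypothetical monthly peak loads in requests per second, with one seasonal spike.
monthly_load = [400, 450, 500, 1800, 600, 420]
CAPACITY_PER_SERVER = 250  # assumed requests per second one server can handle

elastic = [servers_needed(load, CAPACITY_PER_SERVER) for load in monthly_load]
fixed = max(elastic)  # a traditional setup must be sized for the worst month

elastic_server_months = sum(elastic)
fixed_server_months = fixed * len(monthly_load)
print("servers per month (elastic):", elastic)
print("server-months, elastic:", elastic_server_months)
print("server-months, sized for peak:", fixed_server_months)
```

Under these assumptions the elastic arrangement uses 19 server-months against 48 for infrastructure sized for the peak, because the extra servers are held only during the month of the spike.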
Reduced need for specialized resources and maintenance services
In the traditional setup, the organization requires expert staff like IT managers and system administrators to run the data centers and relies on outside vendors for maintenance. However, in the cloud environment, most of these are the responsibilities of the cloud provider (of course, the extent depends upon the cloud type used by the consumer organization), which saves the organization time and money.
On-Demand Self-Service Mode versus Well-Planned Time-Consuming Ramp Up
In the traditional setup, the addition of a new facility or capability requires significant planning, which is often stressful and necessitates extensive collaboration with internal and external resources. However, in the cloud environment, the additional infrastructure, storage, or other capabilities can be requisitioned on demand, through self-service, without even the intervention of the cloud provider.
Redundancy and Resilience versus Single Points of Failure
Traditional IT infrastructure is mostly prone to single points of failure, as most applications are deployed on a single server and backups are taken in the traditional way through costly mechanisms like online mirroring or server mirroring. In a cloud environment, the virtualization across many VMs and the usage of storage equipment with better capability can provide better redundancy and resilience than the traditional systems.
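The effect of redundancy on availability can be shown with a standard calculation. The sketch assumes that the replicas fail independently and that any one working replica is enough to provide the service. The 99% figure is hypothetical, and in a shared infrastructure failures are often correlated, so real gains are smaller.

```python
def combined_availability(single, replicas):
    """Availability of `replicas` independent copies when any one copy suffices."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    # The service is down only when every replica is down at the same time.
    return 1 - (1 - single) ** replicas


SINGLE = 0.99  # assumed availability of one server or VM
for n in (1, 2, 3):
    print(f"{n} replica(s): {combined_availability(SINGLE, n):.6f}")
```

A single server at 99% is a single point of failure. Under the independence assumption, a second replica raises the combined figure to about 99.99%.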
Cost of traditional DRP and BCP versus the DRP & BCP through Cloud Environment
Traditional IT infrastructure requires elaborate arrangements for DRP & BCP including costly hot / warm / alternative sites, equipped with relevant infrastructure depending upon the business criticality and infrastructure criticality. Further, these have to be revisited and maintained to be effective in case of need. However, cloud environments permit cheaper, more reliable, and easier to implement and use DRP & BCP mechanisms.
Ease of use on the Cloud Environment
Access to and use of the cloud environment is available through simple mechanisms like Web browsers, thin client interfaces, or simple interfaces provided by the cloud provider, whereas traditional client server applications require significant learning on the part of the users for most of the applications.
Important Enablers of Cloud Computing
Internet: Most of the recent advances in technology can be attributed to the Internet which enables connectivity from one system to any other system across the globe. This connectivity has enabled any organization from anywhere to connect to the cloud infrastructure deployed somewhere else.
Network Bandwidth and Reliability: The next important advancement is the availability of higher network bandwidth at a lower cost. This has enabled the organizations to connect to the cloud environment using web browsers and thin clients, yet supporting the required performance across the network in spite of requests being pumped into the cloud and output being pumped down the network channel. Further, high speed and highly reliable network equipment provide requisite support to this end.
Server Virtualization: Server virtualization is the most important enabler of the cloud. The slicing of the physical hardware into virtual machines, enabling different users to use the same physical equipment transparently and without impacting each other, is the biggest contribution of virtualization. Cloud computing would not have been possible without virtualization. It is assisted by virtual network communications.
Cheaper and reliable equipment: The cost of hardware, from servers to hard disks, solid state devices, memory, and peripherals, has gone down year after year. At the same time, the reliability of this equipment has gone up significantly. These two trends have enabled a huge cloud infrastructure to be created at a lower cost and made available at a lower price to the users.
Standardization: Tremendous standardization efforts have been carried out across the globe to ensure better interoperability of hardware. Protocols have been well defined to enable hardware / communication / network equipment manufacturers to appropriately design their equipment. This standardization effort has led to significant interoperability among different hardware, leading to the possibility of large-scale deployment of the cloud infrastructure.
Advancement in Technology: Advances in technology have also propelled cloud computing capability. These include: clustering of computers, enabling higher computing power; increased storage capability enabled through network technologies like Storage Area Networks (SANs); high-speed links enabling online mirroring; and Web technologies like Web 2.0 and Web services.
Four Cloud Deployment Models
Organizations have different needs for cloud services based on the criticality of the applications, sensitivity of data, speed of servicing the requests, safety related aspects, customer requirements, and security requirements. One organization may have two different types of deployment of cloud models applicable to different segments or groups or for different applications within their organization running at the same time.2&4
The four Cloud Deployment Models are: Private Cloud, Public Cloud, Community Cloud, and Hybrid Cloud. Each of these deployments is discussed in the following sections.
A private cloud is dedicated and set up for a particular cloud consumer based on their unique high-security needs. However, private cloud comes only at high capital investment as well as high maintenance cost. A private cloud ensures high availability, confidentiality of the data, and complete integrity of the data. A private cloud allows the user to realize the benefits of cloud computing, such as avoiding upfront high capital expenditure, pay as you use model, best in class applications, high end technology, and large and fast storage capability without the security risks of a public cloud environment.
Of course, as this private cloud is dedicated to the cloud consumer and designed as per the requirements / specifications of the cloud consumer, it is relatively costly compared to the other cloud deployment models.
The advantage of this model is that the cloud consumer still has significant control over the cloud infrastructure and can negotiate and agree on clear terms and conditions with the cloud provider, including the security level requirements. The security in this type can be demanded by the cloud consumer and can be provided by the cloud provider.
Unlike the private cloud, the public cloud is the cheapest option available on the cloud infrastructure and it is available to everybody. Here, the cloud provider leverages a huge infrastructure that is shared with a large number of customers as the provider deems fit. While these clouds are cheaper, the terms and conditions are mostly decided unilaterally by the cloud provider and the cloud consumer has very little say in them. The cloud consumer does not have much control over the cloud provider. The cloud provider assumes little responsibility for either the availability or the security of the information. Hence, the cloud consumer is at the mercy of the situation or the cloud provider most of the time. A public cloud is best for new organizations with no critical data that are looking for a cost-effective solution. Of course, cloud consumers can request higher service levels and higher assurance, but these come at a higher cost and still may not be appropriately serviced. The advantage for the smaller organization is the expertise of the cloud provider, which a smaller cloud consumer may not have.
As multiple customers co-exist on the cloud, it may be difficult to monitor the activities of each of these individual customers and hence it is highly difficult and impractical for the most part for the cloud provider to assure high security in this type of deployment model.
Community clouds are set up for collaborative work between organizations belonging to a particular community, like health care, government organizations, or social service organizations with shared objectives. These clouds are set up by the community for use by the organizations within the community. They may be managed by one or more of the organizations belonging to the community, or by a designated outsourced agency or a third-party cloud provider. The collaborative setup of such a cloud brings down the cost of the infrastructure to each organization while bringing the benefits of cloud to all the participants. However, information security may still be a concern, as many organizations will be using the cloud and the cloud provider may not have the necessary security expertise. Also, the capability and strength of such a cloud depends upon the capability of those who design and maintain it and the infrastructure they choose to support it.
An organization may use a hybrid cloud, which is a combination of any of the above three deployment models (Private Cloud, Public Cloud, and Community Cloud). Depending upon the different purposes for which the cloud infrastructure is to be used, different deployment models may be used by a cloud consumer. These may be requisitioned from different cloud providers. Highly critical and highly confidential applications may go on to private cloud while non-critical, non-sensitive, generic applications may go on to public cloud and community collaborations may be serviced through community clouds.
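The placement rule described for a hybrid cloud can be sketched as a small decision function. The workload names and their classification are hypothetical, and a real decision would weigh many more factors, such as cost, legal requirements, and customer commitments.

```python
def choose_deployment(critical, sensitive, community_collaboration=False):
    """Pick a deployment model for one workload, following the rule of thumb above."""
    if critical or sensitive:
        return "private"
    if community_collaboration:
        return "community"
    return "public"


# Hypothetical workloads and their assessed traits.
workloads = {
    "payroll": dict(critical=True, sensitive=True),
    "public website": dict(critical=False, sensitive=False),
    "shared research portal": dict(
        critical=False, sensitive=False, community_collaboration=True
    ),
}

plan = {name: choose_deployment(**traits) for name, traits in workloads.items()}
is_hybrid = len(set(plan.values())) > 1  # more than one deployment model in use

for name, model in plan.items():
    print(f"{name}: {model} cloud")
print("hybrid deployment:", is_hybrid)
```

The rule deliberately lets criticality or sensitivity override everything else, so a workload moves to a cheaper model only when it is neither.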
Main Security and Privacy Concerns of Cloud Computing
While the benefits are very attractive, security concerns are also significant. Some of the important security concerns are described in the following section.
Various countries have various privacy, statutory, and regulatory requirements. In IT and related fields, these are still being strengthened. No organization can afford to violate the laws of the land (both of the country to which it belongs and of the countries from which it operates). Violations of these legal requirements can expose organizations to severe penalties and may even risk the closure of their business, in addition to the loss of reputation and other risks. For example, consider this situation: US export laws prohibit the export of items such as nuclear reactor designs and high-end encryption technology to trade-restricted countries. In a public cloud environment, you may not know where the data will reside. If the capacity is requisitioned from China, your data, which is not allowed to be exported to a trade-restricted country such as China, is now residing in China on the cloud! Your employees or third-party service providers may also abuse the cloud infrastructure for their own unethical or immoral purposes, like pornography, racial abuse, or extortion. As the monitoring, either by the cloud provider or by yourself, may be minimal, this may expose your organization as a violator of the law.
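One way a cloud consumer might guard against the data location problem is to check every planned placement against a residency policy before data is uploaded. The sketch below is purely illustrative: the classifications, region names, and datasets are hypothetical, and it presumes that the provider discloses the region in which capacity is provisioned, which may not hold in a public cloud.

```python
# Hypothetical policy: data classification -> regions where it may be stored.
RESIDENCY_POLICY = {
    "export-controlled": {"us-east", "us-west"},
    "customer-personal": {"eu-central", "eu-west"},
}


def residency_violations(placements, policy=RESIDENCY_POLICY):
    """Return (dataset, region) pairs stored outside the regions their class allows.

    placements: list of (dataset, classification, region) tuples.
    Classifications missing from the policy are treated as unrestricted.
    """
    violations = []
    for dataset, classification, region in placements:
        allowed = policy.get(classification)
        if allowed is not None and region not in allowed:
            violations.append((dataset, region))
    return violations


# Hypothetical placements reported for three datasets.
placements = [
    ("reactor-design-docs", "export-controlled", "us-east"),
    ("crypto-module-source", "export-controlled", "ap-east"),
    ("marketing-site-assets", "public", "ap-east"),
]
print(residency_violations(placements))
```

A check of this kind is only as good as the location information behind it, which is why transparency from the cloud provider matters.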
Lack of Segregation of Duties
In the case of in-house systems, most of the duties are well segregated. For example, the accounting personnel or payment authorizers are never the system administrators, application developers are not the system administrators, and so on. In the cloud environment, you do not know who is taking care of which responsibility. It is possible that the same resource is responsible for many aspects. This exposes the cloud consumer to a significant risk. Frauds committed by such resources may come to light only very late, or may not come to light at all.
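The in-house rule that certain duties must never be combined can be checked mechanically when the role assignments are known. The sketch below uses the two conflicts named above. The people and the extra "auditor" role are hypothetical, and in a cloud environment the consumer may simply not have this information about the provider's staff.

```python
# Pairs of roles that one person should never hold together (from the examples above).
CONFLICTING_ROLES = [
    frozenset({"payment authorizer", "system administrator"}),
    frozenset({"application developer", "system administrator"}),
]


def sod_violations(assignments, conflicts=CONFLICTING_ROLES):
    """Return {person: [conflicting role pairs]} for people holding clashing roles.

    assignments: dict mapping each person to the set of roles they hold.
    """
    violations = {}
    for person, roles in assignments.items():
        clashes = [sorted(pair) for pair in conflicts if pair <= set(roles)]
        if clashes:
            violations[person] = clashes
    return violations


# Hypothetical role assignments.
assignments = {
    "alice": {"payment authorizer"},
    "bob": {"application developer", "system administrator"},
    "carol": {"system administrator", "payment authorizer", "auditor"},
}
print(sod_violations(assignments))
```

When a provider cannot or will not supply such role information, the consumer has no means of running even this simple check, which is the risk this section describes.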
Complexity of the Cloud Computing System2
The cloud computing system is a system of interconnected systems with a large number of components and sub-components. This leads to high complexity. Any infection anywhere, or any failure of a system, subsystem, or component for any reason, including security flaws or malfunctioning, can have a substantial impact on the availability, confidentiality, and integrity of the cloud computing services themselves. Further, a cloud computing system is provisioned by a cloud provider using services from many other parties. Some parts of the system may have been further outsourced, and this may vary from one cloud provider to another. The failure, negligence, or mistake of one party can also have a substantial impact on the availability, confidentiality, and integrity of the cloud computing facility itself. Further, even though significant standardization efforts are being driven, there may still be known or unknown interoperability issues between various systems, subsystems, or components. Also, the proprietary application programming interface provided by the cloud provider can make it a challenge to understand and implement the security.2 All these can put the cloud consumers in a disadvantageous position if any anticipated or unanticipated lapse, failure, disruption, or issue occurs.
Shared Multi-tenant Environment
The cloud computing environment provides for usage of the same physical environment by multiple unrelated organizations. One consumer's infrastructure is separated from another's only virtually or logically. With new means of malicious attack being devised day in and day out, attackers who hold a cloud tenancy and intentionally try to exploit loopholes, flaws, or errors can create significant availability, confidentiality, and integrity issues for other consumers. Again, the cloud providers may not be able to monitor all the activities happening within the cloud, thus putting other consumers at risk.
Internet and Internet Facing Applications2
Access to the cloud is through the internet. The non-availability of the internet link, or degradation of its performance, can pose problems for the cloud consumers. These may degrade the performance of access or deny access to the required services on the cloud. Further, each cloud consumer organization uses many internet-facing applications to manage the cloud, for example to configure the applications or setup on the cloud, to transact on the data on the cloud, or to provision additional services or infrastructure. One of the important components of such access is the client web browser. Unless these and any other internet-facing applications are very secure, and are kept secure through conscious and appropriate configuration, patching, and updates, there is a high chance of the cloud consumer being poorly protected. Further, if the cloud provider itself uses remote administrative facilities to service clients or to maintain the infrastructure, this can also pose substantial threats to the cloud consumers unless appropriate care is taken by the cloud provider.
Control of the Cloud Consumer on the Cloud Environment
In the cloud computing environment, the degree of control the cloud consumer has depends on the cloud service model as well as the cloud deployment model. The cloud consumer has the least control when it opts for SaaS on a public cloud and the most control when it opts for IaaS on a private cloud. The cloud consumer has to evaluate its objectives for using cloud computing and its security requirements, including those imposed by its customers and by law, and choose the service and deployment models accordingly.
Types of Agreements related to Service Levels and Privacy with the Cloud Provider
Various agreements may be signed between the cloud provider and the cloud consumer. These include Service Level Agreements (which set out the service levels the cloud provider assures to the cloud consumer), privacy agreements, and acceptable use policies (which usually detail the do’s and don’ts for the cloud consumer and its users). Data ownership also has to be explicitly agreed between the cloud provider and the cloud consumer to avoid any confusion or dispute. This is essential to protect the intellectual property rights and proprietary information of the cloud consumer organization. There may be many more agreements depending upon the context. In the case of a public cloud, the agreements normally contain conditions imposed unilaterally by the cloud provider, and the cloud provider can also modify these conditions unilaterally at any time. A cloud consumer that needs specific service levels, privacy requirements, or other security requirements has to request and agree on them explicitly, at a higher cost. Cloud consumer organizations have to understand this clearly and handle it appropriately. Otherwise, the cloud consumer may be at the mercy of the cloud provider and may not be adequately protected.
Data Management and Data Protection
In the cloud computing environment, infrastructure released by one cloud consumer is allocated to another. If the data on it is not sanitized and cleaned up completely, it may be retrievable through advanced technologies, which creates a potential threat of sensitive data theft or misuse by others. Further, a cloud consumer’s data may be concentrated on the cloud, which presents an attractive target to attackers. Also, because of the technical complexities, the data isolation between cloud consumers may be at risk, and one cloud consumer’s data may be accidentally exposed to another. The cloud consumer should check these points explicitly with the cloud provider to be assured that the relevant controls are in place.
Insider threats remain one of the significant risks in any system and should not be discounted. They can be accentuated in the cloud computing environment because the organization lacks effective control over the services and over control mechanisms such as the review of logs and audit trails. In the cloud computing environment, there is arguably a greater need for cloud consumer organizations to ensure that user and administrative activities are logged and that audit trails are kept for the applications. Further, these logs and audit trails have to be analyzed to detect any insider threats.
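As a rough illustration of what analyzing an audit trail might look like, the following Python sketch flags users who perform a sensitive action outside working hours or who accumulate repeated failures. The record layout, the `export_data` action name, the working hours, and the failure threshold are all assumptions made for this example; a real deployment would use whatever log format and policy the organization has agreed with its cloud provider.

```python
from collections import Counter
from datetime import datetime

# Hypothetical audit records: (ISO timestamp, user, action, outcome).
AUDIT_LOG = [
    ("2024-03-04T10:15:00", "alice", "login", "success"),
    ("2024-03-04T02:40:00", "bob", "export_data", "success"),
    ("2024-03-04T11:00:00", "carol", "login", "failure"),
    ("2024-03-04T11:01:00", "carol", "login", "failure"),
    ("2024-03-04T11:02:00", "carol", "login", "failure"),
]


def flag_suspicious(records, work_start=8, work_end=18, max_failures=2):
    """Return a sorted list of users whose activity deserves a closer look."""
    flagged = set()
    failures = Counter()
    for timestamp, user, action, outcome in records:
        hour = datetime.fromisoformat(timestamp).hour
        # Assumed policy: data exports outside working hours are unusual.
        if action == "export_data" and not (work_start <= hour < work_end):
            flagged.add(user)
        if outcome == "failure":
            failures[user] += 1
    # Assumed policy: more than max_failures failed attempts is unusual.
    flagged.update(user for user, count in failures.items() if count > max_failures)
    return sorted(flagged)


print(flag_suspicious(AUDIT_LOG))  # ['bob', 'carol']
```

A flag raised this way is only a prompt for human review, not proof of an insider threat.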
Security Issues on account of multiple levels
A cloud infrastructure is built up of multiple levels that work through the interfaces between them. This may pose additional threats; to counter them, multiple security layers may have to be deployed, as appropriate, to provide adequate security.
Physical security issues related to Cloud Computing environment
As discussed earlier, the cloud infrastructure is spread over multiple data centers and multiple locations. All these centers and the infrastructure have to be physically protected. A compromise in one of the centers can potentially nullify the physical security controls exercised at other places.
Cloud Applications Security
The cloud applications used by cloud consumers have to be tested for all the vulnerabilities that may exist, such as SQL injection and buffer overflows, as these are still relevant in the cloud computing environment. The secure design and development practices of the application creators and application providers have to be verified, either directly or through third-party assurance mechanisms such as certification agencies. Otherwise, while you may be saving cost, you may be exposed to more issues related to availability, confidentiality, and integrity, which also put the reputation of your organization at risk and place your customers in an awkward position.
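To make the SQL injection risk concrete, the following Python sketch contrasts a query built by string concatenation with a parameterized query, using the standard `sqlite3` module and an in-memory database. The `users` table and the function names are hypothetical and exist only for this illustration.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the input is passed as a value, never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2: the injected condition matches every row
print(len(find_user_safe(conn, payload)))    # 0: no user has that literal name
```

The same test payload can be reused when checking a cloud application: a secure implementation should treat it as an ordinary, non-matching string.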
Threats on account of Virtual Environment
The virtual machine (VM) instances created by the cloud provider to service its cloud consumers may themselves create risks. It may be possible for one virtual machine to monitor and spy on another, or for the host to monitor and spy on the virtual machines. A malicious infection of one VM may spread to others. Data transfers from one VM to a related VM may themselves be attacked. A back door on a VM may adversely affect security because of covert channels between the guest operating system and the host. Further, VMMs (hypervisors) such as Xen are so complicated that they may contain flaws that compromise the security of the cloud environment. Bugs, defects, and other security flaws in any component of the host system, including the VMM, the VMs, or the guest operating systems, can compromise the overall security of the cloud computing environment.
Encryption and Key Management
Most communication between consumer-facing tools or mechanisms and the cloud computing environment is encrypted. However, encryption is only as good as the encryption method used and the strength of the encryption keys. If the keys are not strong, or if their secrecy is not maintained, the cloud consumer is exposed to high risks. Further, most applications require the data to be decrypted before further processing or computing, which may increase the possibility of exposing sensitive data.
The risks highlighted above are primarily those applicable to the cloud computing environment, and the list may not be exhaustive. Many other risks and vulnerabilities that apply to VMs, operating systems, host hardware, networks, and application software also apply to the cloud computing environment and need to be considered appropriately. These include malware infection, man-in-the-middle attacks, sniffing, spoofing, session hijacking, command injection, buffer overflows, credentials hacking, and password cracking. Similarly, defects, bugs, and errors in any of these components can expose the cloud computing environment to compromise or attack.
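As a minimal sketch of what a "strong key" can mean in practice, the Python example below generates a random 256-bit key from the operating system's secure random source and, alternatively, derives a key from a passphrase with PBKDF2. The function names, the 32-byte key size, and the iteration count are illustrative assumptions, not recommendations taken from this chapter; the sketch covers only key creation and does not address how keys are stored or kept secret.

```python
import hashlib
import secrets


def new_random_key(num_bytes=32):
    """Generate a random key (32 bytes = 256 bits) from a secure source."""
    return secrets.token_bytes(num_bytes)


def derive_key(passphrase, salt, iterations=200_000):
    """Derive a 32-byte key from a passphrase using PBKDF2-HMAC-SHA256.

    The iteration count is an assumed, illustrative value; higher counts
    make password guessing slower at the cost of more computation.
    """
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)


salt = secrets.token_bytes(16)  # stored alongside the ciphertext, not secret
key_a = new_random_key()
key_b = derive_key("correct horse battery staple", salt)
print(len(key_a), len(key_b))  # 32 32
```

A randomly generated key is generally preferable where a human does not need to remember it; the passphrase-derived key is only as strong as the passphrase behind it.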
Some Mechanisms to address the Security and Privacy Concerns in Cloud Computing Environment
The cloud providers, as well as the cloud consumers, are learning from current issues and concerns and from current deployments of the cloud computing environment. Many standardization efforts and best practices are being formulated and put in place. However, there are still plenty of risks and vulnerabilities that can adversely impact security, privacy, and legal requirements and affect the cloud consumers significantly. Following are some of the mechanisms that can be used to handle or contain the security, privacy, and legal concerns [2]:
Understand the Cloud Computing environment and protect yourself
Each Cloud Consumer has to understand the services provided by the cloud provider thoroughly, including how the security, privacy and legal considerations are handled effectively by the cloud provider organizations. Further, the data ownership issues, privacy, security, and legal issues have to be explicitly agreed upon through comprehensive legal agreements between the cloud provider and cloud consumer.
Understand the Technical Competence and segregation of duties of the Cloud Provider
The cloud consumer has to make efforts to understand the technical competence of the cloud provider and ensure that the provider has staff experienced not only in infrastructure but also in information security. Further, segregation of duties among the cloud provider staff is very important so that powers are not concentrated in a single role and a client’s data cannot be misused for malicious purposes. The cloud consumer can explicitly agree such expectations and requirements with the cloud provider and make them part of the agreement between them. In our view, the cloud provider should also consider this, to safeguard its reputation, even when consumers do not explicitly request it.
Protection against Technical Vulnerabilities and Malicious Attacks
Each cloud consumer has to be aware of the mechanisms the cloud provider has put in place to understand the technical vulnerabilities, keep track of the ongoing technical vulnerabilities and malicious attacks, and proactively take such actions that are required to ensure security and privacy of the data of the cloud consumer. Another related aspect is that the cloud providers need to be transparent to the cloud consumers, without delay, about the security and privacy breaches observed by them at any point of time. This will enable them to gain the confidence of the cloud consumers. Also, appropriate Incident Response Mechanisms have to be instituted and followed by the cloud providers.
Regular Hardening and Appropriate Configurations of the Cloud Computing Environment
Depending upon where the responsibility lies (which in turn depends upon the cloud computing services used and the cloud deployment model), both the cloud provider and the cloud consumers have to take comprehensive steps to ensure appropriate settings and configurations, hardening of the environment, appropriate design and development, appropriate interoperability, and adequate testing. These steps should cover everything from the Virtual Machine Monitor to the Virtual Machines, the Guest Operating Systems, and the Applications. Further, all cloud interfacing mechanisms, including web browsers, have to be well configured and well protected against malicious attacks. The cloud providers have to secure back-to-back commitments from other providers and vendors, including outsourced vendors and other partners, so that the availability, confidentiality, and integrity of the cloud consumer data, including authentication and authorization, are achieved.
Cloud providers have to build sufficient steps into their systems and procedures to ensure that the data of different cloud consumers is well insulated / isolated from each other and that the data is sanitized appropriately when resources are reallocated from one cloud consumer to another. The cloud consumers also have to ensure that not all of their valuable data is consolidated in one place on the cloud, where it would become a target for attackers.
Appropriate encryption mechanisms have to be used with strong encryption keys while authenticating to the cloud infrastructure and also while getting the response back. The mechanisms have to be built by both the cloud provider and the cloud consumer to ensure that the encryption keys are held secret. Similarly, the data stored on the cloud computing environment has to be held encrypted. When decrypted for usage or processing purposes, the data needs to be well protected by appropriate complementary mechanisms.
Good Governance Mechanisms
Both the cloud providers and the cloud consumers have to set up good governance mechanisms to ensure that there is visibility, transparency, and mutual trust within their individual organizations and with all other stakeholders.
Compliance requirements have to be well understood and well-articulated by the cloud consumer organizations and ensured by the cloud providers. The transparency and visibility to the cloud consumer has to be provided by the cloud provider in this regard, including the location of the storage of the data of the cloud consumer (as applicable). Otherwise, compliance will be a big issue for the cloud consumers.
Logging and Auditing
Important activities, including those of the administrators, have to be logged and audited by those who are responsible for them. It is not enough merely to log these activities; the logs also have to be analyzed to determine whether malicious activities or attacks have already been perpetrated or are being planned.
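Logs are useful for auditing only if they can be trusted. One common way to make tampering detectable, shown here purely as an illustrative sketch and not as a method prescribed by this chapter, is to chain log entries together with hashes so that changing an earlier entry invalidates everything after it. The entry layout and function names below are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(event, prev_hash):
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain, event):
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash,
                  "hash": _entry_hash(event, prev_hash)})


def verify_chain(chain):
    """Return True only if every entry still matches its recorded hash."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "admin alice logged in")
append_entry(log, "admin alice exported customer table")
print(verify_chain(log))  # True

log[0]["event"] = "nothing happened"  # simulate tampering
print(verify_chain(log))  # False
```

Someone able to rewrite the entire log could recompute every hash, so in practice the most recent hash would need to be kept somewhere the log's administrators cannot alter.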
Patching / Updating
All the components of the cloud computing environment, from the host systems and network components to the operating systems and applications, have to be patched and maintained daily by those who are responsible.
Application Design and Development
All the applications that are provided or deployed on the cloud computing environment have to be appropriately designed and developed using secure design and development concepts. Typical vulnerabilities, such as non-validated inputs that lead to various attacks, and possible defects or bugs have to be avoided through secure design and development and through comprehensive testing, configuration management, release, and deployment practices.
All cloud computing facilities have to be physically secured with appropriate measures at every location.
Strong Access Controls
Strong access controls need to be built by the organizations that have their applications and data on the cloud computing environment. Depending upon the risks, these may be through multi-factor authentication or through other relevant mechanisms. Otherwise, security of the applications and data will only be a myth.
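One widely used second factor in multi-factor authentication is a time-based one-time password (TOTP), standardized in RFC 6238 on top of the HOTP algorithm of RFC 4226. The Python sketch below implements the calculation with the standard library only; the function names are our own, and the shared secret shown is the published RFC test secret, which must never be used in a real system.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, at_time=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at_time is None else at_time
    return hotp(secret, int(now // step), digits)


def verify_totp(secret, submitted, at_time=None):
    """Compare in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(totp(secret, at_time), submitted)


RFC_TEST_SECRET = b"12345678901234567890"  # published test value only
print(hotp(RFC_TEST_SECRET, 0))  # 755224 (RFC 4226 test vector)
print(verify_totp(RFC_TEST_SECRET, totp(RFC_TEST_SECRET)))
```

A production verifier would typically also accept codes from an adjacent time step to tolerate clock drift and would limit the number of attempts; both are omitted here for brevity.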
There should be well planned backup mechanisms deployed to ensure availability in case of exigencies. They still cannot be done away with.
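A backup is only useful if it can be restored intact. As a small, hedged illustration, the Python sketch below copies a file to a backup directory and compares SHA-256 checksums of the source and the copy before reporting success. The function names and the temporary paths are hypothetical, and a real backup mechanism would also handle scheduling, retention, encryption, and off-site or cloud storage.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy source into backup_dir and confirm the copy matches the original."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)  # copies contents and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} failed verification")
    return target


with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "ledger.csv"
    original.write_text("id,amount\n1,100\n")
    copy = backup_and_verify(original, Path(workdir) / "backups")
    print(copy.name, sha256_of(copy) == sha256_of(original))  # ledger.csv True
```

Recording the checksum at backup time also allows the copy to be rechecked later, for example before a restoration.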
Third-Party Certifications / Auditing
The cloud providers have to obtain third-party certifications or agree to, or arrange for, strict third-party auditing of their cloud computing environment. The results have to be made transparent to the cloud consumers, and strict and immediate action has to be taken, without fail, on any findings.
We introduced data backups and cloud computing. We looked into how these two topics, even though they can be treated as distinct subjects, complement each other: the cloud computing environment provides for backups, and backups can be enabled through cloud computing. We also highlighted how data backups provide for availability, while cloud computing opens up new avenues, for example for cost reduction.
We examined the concept of data backups, highlighted the need for backups, and discussed the dangers of not conducting regular data backups. We looked at how the data backups help us out in case of issues like crash of servers, loss of integrity of data, and corruption of data.
We explored various categorizations of data backups. We differentiated between online backups, near-line backups, offline backups, on-site backups, off-site backups, automated backups, and manual backups. We also looked into some of the enablers of these backups. In this context, we explored RAID, server mirroring, electronic vaulting, and remote journaling. We also highlighted the fact that different methods of data backup may be required in the context of database systems.
We went on to explore the backup strategy and restoration strategy. We also highlighted the importance of these strategies and what considerations have to be accorded to them.
We then explored further the security considerations related to the data backups and highlighted the important security considerations which need to be invariably taken care of.
We then went on to explore some of the possible issues with data backups and then detailed the important best practices related to backups and restoration.
We introduced the cloud computing concept and then went on to identify and define the cloud computing environment through the delineation of three service models and four deployment models.
We explored each of the cloud computing service models: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). We highlighted how SaaS enables organizations to have cost-effective applications, how PaaS enables the access to the software design, development, testing, and deployment tools to the software vendors, and how IaaS provides beneficial infrastructural facilities like servers and storage.
We elaborated on the benefits provided by the cloud computing environment and the enablers of cloud computing.
We explored in detail each type of cloud deployment model: Private Cloud, Public Cloud, Community Cloud, and Hybrid Cloud. We also differentiated between them and highlighted their applicability to various organizations during the course of discussions.
We discussed the important security, privacy, and legal concerns on using the cloud computing environment.
We also described some mechanisms to address these security, privacy, and legal concerns.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits any noncommercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this chapter or parts of it.
The images or other third party material in this chapter are included in the chapter’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.