
Amazon CTO: 10 lessons I learned in 10 years of building AWS

via: 博客园     time: 2016/4/10 20:30:39


English original: 10 Lessons from 10 Years of Amazon Web Services

A few days ago, Amazon CEO Jeff Bezos said in a letter to shareholders that Amazon's cloud business, AWS, now has more than 1 million users and that its revenue will exceed $10 billion in 2016. AWS launched in March 2006 and has now been running for a full 10 years. Its first cloud service was the Simple Storage Service (S3), followed by Amazon Elastic Compute Cloud (EC2), Amazon SimpleDB, Amazon Simple Queue Service, Amazon CloudFront, and other cloud services. Today, thousands of startups build their online businesses on AWS data centers and services. It is not only small companies that rely on AWS: many large companies, such as Adobe, GE, Netflix, and Pinterest, use it as well.

To mark the 10th anniversary of AWS, Amazon CTO Werner Vogels, a central figure in the development of AWS, summarized and shared ten lessons he learned from 10 years of operating AWS, in the hope that others can draw inspiration and lessons from them.

1 Build systems that can evolve continuously, from day one

From day one, we understood very clearly that the software we were building would need continuous improvement, and that the software running today would not be the software running a year later. We expected that with every order-of-magnitude increase in scale we would need to revisit and revise our architecture to make sure it could still handle the scalability problems.

However, many companies around the world depend on our platform to provide uninterrupted service 24 hours a day, 7 days a week, so we could not take the once-common route of scheduling maintenance downtime to upgrade the system. We therefore had to build, from the start, an architecture in which new software components could be introduced without taking the service down. Marvin Theimer, one of Amazon's great engineers, once joked that the evolution of Amazon S3 looks a lot like this: we started out flying a single-engine Cessna, after a while upgraded it to a Boeing 737, then replaced that with a fleet of Boeing 747s, and what we fly now is more like a large fleet of Airbus A380 superjumbos. All this time we have kept the aircraft flying normally by refueling in mid-air. Meanwhile, we moved AWS users in mid-flight from the old plane to a newer one, and they never even noticed that they had quietly been transferred to a more advanced aircraft.

2 Prepare for unexpected failures and problems

Failures are hard to avoid. Given enough time, anything can fail: from routers to hard disks, from operating systems to memory units corrupting TCP packets, from transient errors to permanent failures. These problems will occur whether you use high-quality hardware or low-cost components.

As a service grows, understanding this becomes more and more important. For example, once Amazon S3 is handling hundreds of millions of transactions, anything with even the slightest probability of going wrong will actually happen. Some of these failures can be anticipated in advance, but many cannot be foreseen during design and construction.
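The scale argument can be made concrete with a little arithmetic. The sketch below assumes that every transaction fails independently with the same small probability; the function names and the numbers are illustrative assumptions, not AWS figures.

```python
import math

def expected_failures(p: float, n: int) -> float:
    """Expected number of failures among n independent transactions."""
    return p * n

def prob_at_least_one_failure(p: float, n: int) -> float:
    """P(at least one failure) = 1 - (1 - p)^n, computed stably for tiny p."""
    return -math.expm1(n * math.log1p(-p))

# Hypothetical numbers: a one-in-a-billion fault and 300 million transactions.
p = 1e-9
n = 300_000_000
print(f"expected failures:    {expected_failures(p, n):.2f}")
print(f"P(at least one):      {prob_at_least_one_failure(p, n):.2f}")
```

Under these assumed numbers, a fault that would almost never show up in a small system has roughly a one-in-four chance of occurring at least once, and the chance keeps rising as the transaction count grows.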

3 Provide primitives rather than one large, all-in-one framework

We soon found that many users wanted to keep building their own businesses on top of the services AWS provides. Having left behind the old world of traditional IT hardware and data centers, they began developing their systems in completely new and interesting ways. Because of this, we needed to be flexible enough to meet their varied needs.

One of the most important things we did was to offer users a set of primitive functions and tools, so that they could use AWS services in whatever way they preferred, rather than forcing an all-inclusive, unified framework on them. This approach has made our users hugely successful, and many later AWS services follow the same model, which many of our users have become accustomed to.

It is also important to recognize that, until users actually start building products and services on ours, it is very hard to predict what their priorities will turn out to be. This is why we launch new services with only a minimal feature set. We can then use feedback from users to extend the service with new features that better meet their needs.

4 Automation is key

Developing software that has to be operated and maintained is very different from developing software that is simply delivered to the customer. Meeting users' expectations for reliability, performance, and scalability at the scale of AWS requires a different mindset and different methods for managing systems.

A key mechanism for achieving this is to automate management as much as possible, so as to avoid the errors that manual operation can introduce. To do that, we needed to build management APIs that control the main functions of our operations. AWS can help users do the same. If you decompose your application into basic building blocks, each with its own management API, you can apply automated rules to operate reliably and predictably at large scale. A very simple test of how well your automation is working is whether you still need to SSH into a server to perform operations: if you do, your automation still needs work.
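As a rough illustration of "building blocks with their own management API plus automated rules", here is a minimal in-memory sketch. `ComponentAPI` and `reconcile` are hypothetical names invented for this example; they do not correspond to any real AWS API. The point is only that the rule acts through the API, so no one has to log in to a server.

```python
from dataclasses import dataclass, field

@dataclass
class ComponentAPI:
    """Hypothetical management API for one building block (in-memory stand-in)."""
    name: str
    instances: dict = field(default_factory=dict)  # instance id -> healthy?
    _next_id: int = 0

    def launch(self) -> str:
        """Start a new, healthy instance and return its id."""
        self._next_id += 1
        instance_id = f"{self.name}-{self._next_id}"
        self.instances[instance_id] = True
        return instance_id

    def terminate(self, instance_id: str) -> None:
        del self.instances[instance_id]

    def unhealthy(self) -> list:
        return [i for i, ok in self.instances.items() if not ok]

def reconcile(api: ComponentAPI, desired: int) -> list:
    """Automated rule: remove unhealthy instances, then restore the desired count."""
    actions = []
    for instance_id in api.unhealthy():
        api.terminate(instance_id)
        actions.append(f"terminated {instance_id}")
    while len(api.instances) < desired:
        actions.append(f"launched {api.launch()}")
    return actions

web = ComponentAPI("web")
print(reconcile(web, desired=3))   # initial roll-out
web.instances["web-2"] = False     # simulate a failure
print(reconcile(web, desired=3))   # the rule repairs it without manual steps
```

In a real system the same rule would run on a schedule or in response to health events, and the in-memory dictionary would be replaced by calls to whatever management API the component actually exposes.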

5 APIs are forever: once live, they cannot be changed

6 Pay close attention to, and understand, your resource usage

When working out a pricing model for a service, make sure you have detailed data about the service's costs and operating expenses, especially if you are running a high-volume, low-margin business. As a service provider, AWS must know its costs very well, so that we can tell whether we can afford to offer a service to users. We can also look for ways to cut costs by improving operational efficiency, and pass those savings on to users through lower prices.

For example, in the early days we did not have a clear picture of the resource costs of Amazon S3. We assumed that storage and bandwidth were the costs that mattered most. After S3 had been running for a while, however, we realized that the number of requests is just as important. If users store a large number of small files, then even a million requests consume little storage or bandwidth; the resource consumed most is the requests themselves. So we adjusted the pricing model to charge for the number of requests as well, to ensure that AWS remains a sustainable business.
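The effect of adding requests as a charging dimension can be sketched with a toy pricing model. The unit prices below are hypothetical values chosen only for illustration, not real AWS rates, and the `Usage` type and function names are assumptions of this example.

```python
from dataclasses import dataclass

@dataclass
class Usage:
    stored_gb: float
    transferred_gb: float
    requests: int

# Hypothetical unit prices, for illustration only (not real AWS rates).
PRICE_PER_GB_STORED = 0.03
PRICE_PER_GB_TRANSFERRED = 0.09
PRICE_PER_1000_REQUESTS = 0.005

def charge_storage_and_bandwidth(u: Usage) -> float:
    """Early model: only storage and bandwidth are charged."""
    return u.stored_gb * PRICE_PER_GB_STORED + u.transferred_gb * PRICE_PER_GB_TRANSFERRED

def charge_with_requests(u: Usage) -> float:
    """Adjusted model: the number of requests is charged as well."""
    return charge_storage_and_bandwidth(u) + (u.requests / 1000) * PRICE_PER_1000_REQUESTS

# A workload of many small files: little data, but a million requests.
small_files = Usage(stored_gb=1.0, transferred_gb=1.0, requests=1_000_000)
print(f"without request charge: {charge_storage_and_bandwidth(small_files):.2f}")
print(f"with request charge:    {charge_with_requests(small_files):.2f}")
```

With these assumed prices, almost all of the charge for the small-file workload comes from requests, which is exactly the cost the first model would have left uncovered.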

7 Take security into account from the beginning

Protecting users' security is a priority that always comes first at AWS, both from an operational perspective and in terms of tools and mechanisms. Security will therefore always be our number one investment.

One thing we learned quickly is that building a more secure service requires considering security at the earliest stage of service design. The security team's job is not to check a service for security problems after development is finished. The security team should be involved from the first day of development, so that security is considered from the very start and throughout the whole project. You cannot compromise on anything that involves security.

8 Data encryption is critically important

Data encryption is a key mechanism for ensuring that users have absolute control over who can access their data. Ten years ago, the tools and services available for data encryption were very poor, and it was only a few years into operating AWS that we gradually learned how best to integrate encryption into our services.

We started with server-side encryption in Amazon S3: anyone inspecting a disk in our data centers would be unable to access any data on it. Later, we launched Amazon CloudHSM and the Amazon Key Management Service. These services let users encrypt data with their own keys, so AWS no longer needs to manage encryption keys on the users' behalf.

Today, in every new AWS service, support for data encryption is built in from the prototype design phase. In Amazon Redshift, for example, each data block is encrypted with a random key, and all of these random keys are in turn encrypted with a master key. Users can supply the master key themselves, so that they are the only ones who can decrypt and access their business-critical data or personal information.
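The key hierarchy described for Redshift (a random key per data block, itself encrypted under a master key) can be sketched as follows. This is an illustration of the structure only: `toy_cipher` is a stand-in built from a SHAKE-256 keystream so that the example needs nothing beyond the Python standard library. It is not how Redshift is implemented and it is not suitable for protecting real data; a production system would use a vetted authenticated cipher. All function and field names here are assumptions of this sketch.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHAKE-256 keystream. Illustration only, NOT production crypto.
    Applying it twice with the same key and nonce returns the original data."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def encrypt_block(master_key: bytes, plaintext: bytes) -> dict:
    """Encrypt one data block under a fresh random key, then wrap that key."""
    block_key = secrets.token_bytes(32)   # random key used for this block only
    data_nonce = secrets.token_bytes(16)
    key_nonce = secrets.token_bytes(16)
    return {
        "data_nonce": data_nonce,
        "key_nonce": key_nonce,
        "ciphertext": toy_cipher(block_key, data_nonce, plaintext),
        # The block key is stored only in wrapped (encrypted) form.
        "wrapped_key": toy_cipher(master_key, key_nonce, block_key),
    }

def decrypt_block(master_key: bytes, record: dict) -> bytes:
    """Unwrap the block key with the master key, then decrypt the block."""
    block_key = toy_cipher(master_key, record["key_nonce"], record["wrapped_key"])
    return toy_cipher(block_key, record["data_nonce"], record["ciphertext"])

master_key = secrets.token_bytes(32)      # supplied and held by the user
record = encrypt_block(master_key, b"customer order history, block 17")
print(decrypt_block(master_key, record))
```

The design choice this shows is that only the small wrapped keys depend on the master key, so whoever holds the master key controls access to every block without the data itself ever being encrypted under it directly.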

Data encryption has always been a high priority in our business. We will keep improving it and making it easier to use, so that users can better protect themselves and their customers.

9 The importance of the network

AWS supports many different kinds of workloads, from high-volume transaction processing to large-scale video transcoding, and from high-performance parallel computing to sites with huge traffic. Each of these workloads places its own distinct demands on the network.

Through innovation in data center layout and operations, AWS has developed unique technology that lets us provide a more flexible network infrastructure to meet the needs of users' different workloads. Along the way we learned that, to let users achieve their goals, we had to develop our own network hardware solutions. This also lets us meet some specialized requirements: for example, to ensure the highest level of security, we can isolate different users from one another on the network.

Another example of how AWS's own network hardware and software helps users improve performance is network access between virtual machines. Because network access is a shared resource, users used to run into network congestion. AWS later developed a NIC that supports single-root I/O virtualization (SR-IOV), which lets us give each virtual machine its own virtualized NIC. This approach reduced network latency by more than a factor of two.

10 No gatekeepers

To give users a broader and deeper service platform, the AWS team has developed and delivered more and more services and features. But AWS is far from limited to the features and services we provide ourselves: many of our partners build on AWS services to further extend and enrich the whole AWS ecosystem.

For example, our partner Stripe uses our services to provide payment services, and Twilio likewise builds its services on AWS. Many of our users also develop their own platforms on AWS to solve problems in their vertical fields: Philips developed the HealthSuite Digital Platform for managing health data, Ohpen built a retail banking platform on AWS, and Eagle Genomics developed a genomics processing platform, among other examples.
