YouTube is the world's largest video-sharing and social media platform, founded in 2005 and acquired by Google in 2006. It is also often described as the world's second most popular search engine. YouTube hosts a large volume of unstructured video content, and its popularity continues to grow rapidly.
Anyone with a Google account can upload content, and any user (with or without a Google account) can freely share and view content.
Every day, users watch more than 1 billion hours of content on YouTube. Managing that much data requires a range of technologies and programming languages. Let's take a look at the programming languages and technologies used by YouTube:
Before Google acquired YouTube, most of the code was written in PHP. PHP had many restrictions and a lot of clutter at the time, so after the acquisition the team moved to Python as one of the core parts of the backend.
Python is used in many Google internal systems and APIs. YouTube uses Python because it makes new ideas easier to implement and is more maintainable and more secure than PHP. Python is used across the YouTube site for a variety of purposes, such as viewing videos, controlling website templates, administering videos, data analysis, and data visualization.
Java is a high-performance server-side programming language. YouTube employs Java because it can handle extremely high traffic volumes. Java is also the language behind Guice, an open-source dependency injection framework released by Google under the Apache License.
Go has cross-platform capabilities, which is a significant benefit. In 2017, Google decided to port the Python wrapper to Go because Go includes built-in tooling, compiles and deploys quickly, and is simple to troubleshoot. (source)
Go has excellent built-in support for concurrency and parallelism, whereas Python's support for running code in parallel is comparatively limited.
Large corporations commonly rely on C and C++ for performance-critical code. At YouTube, C and C++ are also used for core functionality such as video processing.
Initially, YouTube used MySQL to store user and video metadata such as users, tags, descriptions, ad information, country-specific customizations, comments, and notes. YouTube stores the video content itself as files managed by Google File System in Google data centers, and the path to each video may be stored in a MySQL table.
I believe YouTube uses MySQL rather than NoSQL because its data is highly structured, and SQL is good for managing structured data. YouTube also needs consistent like and dislike counts for each video, which requires atomic updates, and MySQL can handle that easily.
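To make the atomicity point concrete, here is a minimal sketch of a like counter that is updated inside a single transaction. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (videos, video_likes) and the like_video helper are hypothetical, not YouTube's actual schema. The idea is that the counter update and the per-user like record either both succeed or both roll back.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical tables for illustration only.
    conn.execute(
        "CREATE TABLE videos ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "likes INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE video_likes ("
        "video_id INTEGER NOT NULL, user_id INTEGER NOT NULL, "
        "PRIMARY KEY (video_id, user_id))"
    )


def like_video(conn, video_id, user_id):
    """Bump the counter and record the like in one transaction.

    Returns True if the like was counted, False if this user had
    already liked the video (in which case nothing changes).
    """
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE videos SET likes = likes + 1 WHERE id = ?",
                (video_id,),
            )
            # A duplicate (video_id, user_id) violates the primary key,
            # which raises IntegrityError and undoes the UPDATE above.
            conn.execute(
                "INSERT INTO video_likes (video_id, user_id) VALUES (?, ?)",
                (video_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def like_count(conn, video_id):
    row = conn.execute(
        "SELECT likes FROM videos WHERE id = ?", (video_id,)
    ).fetchone()
    return row[0]
```

If the same user likes a video twice, the second call fails on the primary key and the counter increment is rolled back with it, so the count never drifts away from the recorded likes.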
Vitess was designed to work alongside MySQL and improve its performance by enabling horizontal scaling of MySQL via sharding.
Vitess also handles failovers and backups automatically. In addition, it rewrites resource-intensive queries and implements caching, which improves server management and database performance.
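The general idea behind sharding can be sketched in a few lines. The example below is a simplified illustration, not Vitess's actual routing code: the ShardRouter class, the shard names, and the choice of MD5 as the hash function are all assumptions made for the sketch. It hashes a sharding key (here a user ID) into a 64-bit key space and splits that space into equal contiguous ranges, one per shard.

```python
import hashlib


class ShardRouter:
    """Maps a sharding key to one of several shards (hypothetical helper)."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("at least one shard is required")
        self.shard_names = list(shard_names)

    def shard_for(self, user_id):
        # Hash the key so that sequential IDs spread evenly across shards.
        digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
        key = int.from_bytes(digest[:8], "big")  # value in [0, 2**64)
        # Split the 64-bit key space into equal contiguous ranges,
        # one range per shard, and pick the range this key falls into.
        index = key * len(self.shard_names) // 2**64
        return self.shard_names[index]


router = ShardRouter(["shard-0", "shard-1", "shard-2", "shard-3"])
print(router.shard_for(42))  # the same user ID always maps to the same shard
```

Because the mapping depends only on the key, every query for a given user can be routed to a single MySQL instance, which is what lets the database scale horizontally.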
BigTable helps manage YouTube's structured data across thousands of servers.
YouTube was initially developed using PHP. Python later replaced most of the PHP code.
Apache Server: Apache is one of the most popular free, open-source web servers for delivering web content over the internet.
Cowboy Web Server: Cowboy is an Erlang/OTP HTTP server that is small, fast, and modern. It is designed with low latency and low memory usage in mind. Because it uses Ranch to manage connections, it can be easily embedded in any application. (source)
Google Cloud Platform
Google Cloud Platform is a cloud computing service provided by Google. Many Google products, such as YouTube, use the Google Cloud Platform. Google Cloud Platform offers computing, data storage, data analytics, and machine learning capabilities.
CDN stands for Content Delivery Network.
YouTube makes use of Cloud CDN for fast, dependable delivery of web and video content on a global scale. It reduces latency while delivering high-quality content to the end user (source).
Application and Data
Google Compute Engine
Google Compute Engine is an infrastructure-as-a-service (IaaS) offering that allows users to create and run high-performance, configurable virtual machines in Google data centers.
Google App Engine
Google App Engine is a platform-as-a-service (PaaS) offering that enables the development of scalable web and mobile applications in Google-managed data centers.
Finance And Accounting
Google Tag Manager
ZooKeeper: used for node coordination.
Memcache: used for caching.
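To show how a cache like Memcache typically sits in front of a database, here is a minimal sketch of the cache-aside pattern. It uses a small in-memory TTLCache class as a stand-in for a real Memcache client, and the get_video_metadata function, the "video:<id>" key format, and the load_from_db callback are all hypothetical names chosen for illustration.

```python
import time


class TTLCache:
    """Tiny in-memory cache with expiry, standing in for Memcache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def get_video_metadata(video_id, cache, load_from_db):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"video:{video_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(video_id)  # slow path, e.g. a MySQL query
    if value is not None:
        cache.set(key, value)  # populate the cache for later readers
    return value
```

Repeated reads of the same video within the expiry window are served from memory, so only the first read (and the first read after expiry) reaches the database.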