Before taking a deep dive into the Slack technology stack, let's warm up with Slack itself.
Table of Contents
- What is Slack?
- Slack Technology Stack
What is Slack?
Slack is a cloud-based collaboration platform first released on 14 August 2013 by Slack Technologies, an American software company headquartered in California. Salesforce acquired Slack in 2021. It started as a messaging app and has since grown into organizational communication and collaboration software with many more features than messaging alone.
Slack is one of the most successful and popular business applications today. Of the Fortune 100 companies, 65 use Slack, which is a great achievement. Large companies such as IBM, PayPal, Airbnb, and Amazon use Slack. Its top competitors among collaboration tools include Microsoft Teams, Zoom, and Skype.
As the graph below shows, Slack had nearly 12 million daily active users in October 2019, and its curve is rising at a rapid pace.
As an IT guy, I use Slack daily to communicate with my coworkers, and I am excited to discuss Slack's technologies with you. So, now let's talk about the Slack technology stack. Slack uses a variety of technologies to build its system.
Slack Technology Stack
For Web Client Application
For Desktop Application
For Android Application
For Android applications, Java and Kotlin are used. Kotlin is more flexible than Java in that you can develop applications in different styles instead of only the traditional OOP approach. Kotlin combines features of both OOP and functional programming, whereas Java is primarily object-oriented. It is also well suited to building high-performance applications, and Google officially supports Kotlin for Android development.
For iOS Application
For iOS applications, Objective-C and Swift are used.
On the backend, Slack initially used PHP 5 and switched to HHVM in 2016, which runs PHP code faster. Hack, the language that HHVM runs, acts like a superset of PHP with many improvements over it.
For Data Storage
Initially, Slack stored its data in MySQL in an active-active configuration, built on a sharded architecture. Slack then ran into scaling and performance problems. In search of scalability, it turned to Vitess. Slack started migrating to Vitess in 2017, and the migration is now complete. Vitess is a database clustering system for horizontally scaling, deploying, and managing large clusters of open-source database instances, and it works well with MySQL. High availability, scalability, operability, extensibility, and performance are critical at Slack, so Vitess is a good fit. Today Slack runs multiple Vitess clusters in different geographical regions around the world.
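To make horizontal scaling concrete, here is a minimal Python sketch of range-based shard routing, the general approach Vitess takes when it maps a sharding key to a keyspace ID and then to a shard. The shard names, the four-way split, and the SHA-256 stand-in hash are assumptions for illustration. They are not Slack's actual layout or Vitess's real hashing function.

```python
import hashlib
from bisect import bisect_right

# Hypothetical layout: four shards, each owning a contiguous range of
# 64-bit keyspace IDs. Upper bounds are exclusive.
SHARD_UPPER_BOUNDS = [0x4000000000000000, 0x8000000000000000,
                      0xC000000000000000, 1 << 64]
SHARD_NAMES = ["-40", "40-80", "80-c0", "c0-"]


def keyspace_id(sharding_key: str) -> int:
    """Hash a sharding key to a 64-bit integer (stand-in hash, an assumption)."""
    digest = hashlib.sha256(sharding_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for_id(kid: int) -> str:
    """Return the shard whose range contains the given keyspace ID."""
    return SHARD_NAMES[bisect_right(SHARD_UPPER_BOUNDS, kid)]


def shard_for(sharding_key: str) -> str:
    """Route a sharding key (for example a team ID) to a shard name."""
    return shard_for_id(keyspace_id(sharding_key))
```

With ranges like these, adding capacity means splitting one range into smaller ones rather than re-hashing every key.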
Caching is the practice of storing data so that it can be retrieved quickly, which reduces the load on the server from frequent requests for the same content. Slack uses Memcached and Mcrouter for caching.
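The load-reduction idea can be sketched with the common cache-aside pattern: look in the cache first and only query the database on a miss. In this Python sketch a plain dictionary stands in for a Memcached client, and `slow_lookup` is a hypothetical database query; neither reflects Slack's actual code.

```python
from typing import Callable, Dict


class CacheAside:
    """Cache-aside helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._cache: Dict[str, str] = {}  # stand-in for a Memcached client
        self._load_from_db = load_from_db
        self.db_calls = 0  # counts how often the slow path is taken

    def get(self, key: str) -> str:
        if key in self._cache:
            return self._cache[key]  # cache hit: no database work
        self.db_calls += 1
        value = self._load_from_db(key)  # cache miss: query the backing store
        self._cache[key] = value  # populate the cache for later callers
        return value


def slow_lookup(key: str) -> str:
    """Hypothetical database query."""
    return f"profile-for-{key}"


cache = CacheAside(slow_lookup)
first = cache.get("U123")   # miss: goes to the database
second = cache.get("U123")  # hit: served from the cache
```

After the two calls, `cache.db_calls` is 1, because only the first request reached the backing store.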
Flannel is used for application-level edge caching. It is a service that reduces connection time when loading Slack, switching channels, and reconnecting to Slack. Flannel caches relevant data about users, channels, bots, and more when the client starts up. Then, on demand, it provides query APIs so that clients can get results quickly.
For search, Slack uses Solr, which provides full-text search. Solr uses the Lucene Java search library under the hood.
For Real-Time Messaging
For real-time messaging, WebSockets are used. The stack provides real-time communication by serving both historical information via the Web API and real-time data via the WebSocket service, so that people get the latest information about what is happening in their team.
First, clients receive information about the team, such as its users and channels, from the Web API via CloudFront and AWS ELB. They then connect to the WebSocket service to receive up-to-date information.
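This two-step flow can be sketched as a client that first loads a snapshot and then applies live events on top of it. The Python below is a pure simulation with no network calls. The snapshot shape, the event types, and the function names are hypothetical and do not reflect Slack's actual API payloads.

```python
from typing import Dict, Iterable, List

State = Dict[str, List[str]]  # channel name -> messages seen so far


def fetch_snapshot() -> State:
    """Stand-in for the initial Web API call that returns historical data."""
    return {"general": ["welcome!"], "random": []}


def apply_event(state: State, event: Dict[str, str]) -> None:
    """Apply one real-time event, as a WebSocket service might push it."""
    if event["type"] == "channel_created":
        state.setdefault(event["channel"], [])
    elif event["type"] == "message":
        state.setdefault(event["channel"], []).append(event["text"])


def run_client(events: Iterable[Dict[str, str]]) -> State:
    state = fetch_snapshot()      # step 1: historical information
    for event in events:          # step 2: stream of live updates
        apply_event(state, event)
    return state


final_state = run_client([
    {"type": "message", "channel": "general", "text": "hi all"},
    {"type": "channel_created", "channel": "dev"},
    {"type": "message", "channel": "dev", "text": "first post"},
])
```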
As traffic to the servers increases, the load has to be balanced among them. For load balancing, HAProxy is used. Slack uses a fleet of HAProxy instances behind a layer 4 load balancer to distribute traffic to the web app tier.
Because running a WebSocket stack through an ELB is difficult, HAProxy is also used to load balance WebSocket connections. HAProxy gives better control over user affinity to specific backends, as well as over the deployment and failover process for those long-lived connections.
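User affinity means the same user keeps landing on the same backend, and a failover should disturb as few long-lived connections as possible. One well-known way to get both properties is rendezvous (highest-random-weight) hashing, sketched below in Python. This illustrates the concept only. It is not HAProxy's algorithm or Slack's configuration, and the backend names are made up.

```python
import hashlib
from typing import Sequence


def _score(user_id: str, backend: str) -> int:
    """Deterministic pseudo-random weight for a (user, backend) pair."""
    digest = hashlib.sha256(f"{backend}|{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_backend(user_id: str, backends: Sequence[str]) -> str:
    """Choose the backend with the highest score for this user."""
    if not backends:
        raise ValueError("no backends available")
    return max(backends, key=lambda b: _score(user_id, b))


backends = ["ws-1", "ws-2", "ws-3"]
users = [f"U{i}" for i in range(200)]
before = {u: pick_backend(u, backends) for u in users}
after = {u: pick_backend(u, ["ws-1", "ws-3"]) for u in users}  # ws-2 fails
moved = [u for u in users if before[u] != after[u]]
```

Only users who were on the failed backend appear in `moved`; everyone else keeps their existing backend.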
For more details, please see the video below.
For Service Discovery
Slack uses Consul for discovering and configuring services. Consul helps maintain reliable and secure connectivity and provides a centralized registry of where each service is located within the network, which makes it less complicated to route from one web app or service to another even as service nodes are added or removed. When Consul is used in conjunction with HAProxy, load balancer configuration can be automated.
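As a rough illustration of that automation, the Python sketch below renders an HAProxy backend section from a snapshot of registered service instances. The registry is a plain dictionary standing in for what a Consul query would return. The service name, node names, and addresses are hypothetical, and no real Consul API is called.

```python
from typing import Dict, List, Tuple

# Hypothetical registry snapshot: service name -> (node name, address, port)
Registry = Dict[str, List[Tuple[str, str, int]]]


def render_backend(service: str, registry: Registry) -> str:
    """Render an HAProxy 'backend' section listing every known instance."""
    lines = [f"backend {service}", "    balance roundrobin"]
    for node, address, port in sorted(registry.get(service, [])):
        lines.append(f"    server {node} {address}:{port} check")
    return "\n".join(lines) + "\n"


registry: Registry = {
    "webapp": [("web-2", "10.0.0.12", 8080), ("web-1", "10.0.0.11", 8080)],
}
config = render_backend("webapp", registry)
```

Re-rendering this section whenever the registry changes, and then reloading HAProxy, keeps the load balancer in step with nodes that join or leave.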
For Server Configuration and Management
Terraform, Chef, and Kubernetes are some of the tools used for server configuration and management.
Terraform is an open-source Infrastructure as Code (IaC) tool that helps manage the entire lifecycle of infrastructure safely and efficiently.
Chef is an open-source infrastructure automation platform that makes it simple to set up, configure, deploy, test, and manage servers in any environment (on-premises, virtual, or cloud-hosted) and on multiple platforms such as Windows, Ubuntu, and Solaris. It lets you manage infrastructure by writing code rather than through tedious manual processes.
Kubernetes is an open-source container orchestration platform; the containers it manages are a lightweight alternative to virtual machines. It automates resource management and allows you to easily scale your application. Because containers package software together with its dependencies, developers can share them with IT operations, which results in faster code delivery.
For Async Task Queuing System
Kafka and Redis are used for async task queuing.
For Data Warehouse
Presto, Hive, and Spark are some of the tools used for the data warehouse.
Presto is a distributed SQL engine designed specifically for interactive queries. It is a quick way to answer ad hoc questions and explore smaller datasets.
Hive is used for large datasets. It transforms SQL-like queries into MapReduce jobs and can handle larger joins with ease.
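To see what turning a SQL-like query into a MapReduce job means, consider a query such as SELECT channel, COUNT(*) FROM messages GROUP BY channel. The Python sketch below runs the same computation as explicit map, shuffle, and reduce steps on an in-memory list. The table and column names are invented, and the code illustrates the programming model only, not Hive's actual query planner.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(rows: Iterable[Dict[str, str]]) -> Iterator[Tuple[str, int]]:
    """Emit (group key, 1) for every row, like the GROUP BY column."""
    for row in rows:
        yield row["channel"], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values for each key, which implements COUNT(*)."""
    return {key: sum(values) for key, values in groups.items()}


messages = [
    {"channel": "general", "text": "hello"},
    {"channel": "random", "text": "lunch?"},
    {"channel": "general", "text": "standup in 5"},
]
counts = reduce_phase(shuffle(map_phase(messages)))
```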
Spark is an open-source data processing framework that lets us write stream processing and batch processing tasks in a more expressive language such as Scala, rather than in SQL-like queries. Spark also allows us to cache data in memory to speed up computations.