The official blog of the Engineering team at LinkedIn
Hosted Search: LinkedIn Search as a managed service
Search functionality is a core part of most data-driven products, and is used widely at LinkedIn. We have long provided a central platform for search functionalities; however, it was not fully managed in the sense that the application teams needed to own and operate the corresponding resources. As data needs grow and an increasingly high number of products want to integrate search, we discovered a need for a fully managed self-service platform to completely democratize search for all of our product teams. In this post, we will talk about Hosted Search, our new search solution that […]
12/08/22
Our Approach to Research and A/B Testing
We are constantly striving to improve the experience on LinkedIn for our members and customers, with research and experimentation, such as A/B Testing, playing a key role in that work. Nearly a decade ago, I discussed the importance of these techniques in our journey to create economic opportunity for every member of the global workforce. Today we have a strong principled approach to how we design and run A/B tests on everything from UI designs to AI algorithms, and feature launches to bug fixes. As our platform continues to grow and evolve, these techniques have become even more […]
12/07/22
Operating System Snapshot Automation
Co-authors: Rohit Jamuar, Tianxin Zhou
Introduction
LinkedIn has a large set of physical servers geographically spread across several locations. Every application is hosted on a physical server and is distributed and managed across one of these hosts. With a reasonably sizable footprint of servers in data centers, LinkedIn is responsible for ensuring that these hosts are always on an operating system (OS) version deemed the “latest and greatest” for all intents and purposes. The Production Systems Software Engineering (PSSE) organization within LinkedIn has taken the responsibility […]
12/06/22
Building LinkedIn's Skills Graph to Power a Skills-First World
Co-authors: Sofus Macskássy, Yi Pan, Ji Yan, Yanen Li, Di Zhou, Shiyong Lin
As industries rapidly evolve, so do the skills necessary for success. Skill sets for jobs globally have changed by 25% since 2015 and this number is expected to double by 2027. Yet, we’ve long relied on insufficient and unequal signals when evaluating talent and predicting success - who you know, where you went to school, or who your last employer was. If we look at the labor market instead through the lens of skills - the skills you have and the skills a role or industry demands - we can create a […]
11/30/22
TopicGC: How LinkedIn cleans up unused metadata for its Kafka clusters
Introduction
Apache Kafka is an open-source event streaming platform where users can create Kafka topics as data transmission units, and then publish or subscribe to the topic with producers and consumers. While most Kafka topics are actively used, some are no longer needed because business needs changed or the topics themselves are ephemeral. Kafka itself doesn’t have a mechanism to automatically detect unused topics and delete them. This is usually not a big concern, since a Kafka cluster can hold a considerable number of topics, hundreds to thousands. However, if the […]
11/29/22
Render Models at LinkedIn
Co-Authors: Mahesh Vishwanath, Eric Babyak, Sonali Bhadra, Umair Saeed
Introduction
We use render models for passing data to our client applications to describe the content (text, images, buttons etc.) and the layout to display on the screen. This means most of such logic is moved out of the clients and centralized on the server. This enables us to deliver new features faster to our members and customers while keeping the experience consistent and being responsive to change.
Overview
Traditionally, many of our API models tend to be centered around the raw data that’s needed for […]
11/22/22
(Re)building Threat Detection and Incident Response at LinkedIn
Co-authors: Sagar Shah and Jeff Bollinger
The Moonshot
LinkedIn connects and empowers more than 875 million members and over the past few years, has undergone tremendous growth. As an integral part of the Information Security organization at LinkedIn, the Threat Detection and Incident Response team (aka SEEK) defends LinkedIn against computer security threats. As we continue to experience a rapid growth trajectory, the SEEK team decided to reimagine its capabilities and the scale of its monitoring and response solutions. What SEEK set out to do was akin to shooting for the moon, so […]
11/09/22
Career stories: From Hollywood videographer to frontend engineer
Originally an LA-based videographer and self-taught developer, Kiope wanted to turn her hobby into a career. Following a coding bootcamp and a stint at a startup, she joined LinkedIn as a frontend (UI) engineer. Based in San Francisco, Kiope speaks about her meaningful accessibility work, impact at scale, and career growth as an aspiring manager.
After my undergrad degree in film studies, I jetted off to Los Angeles to start my career in Hollywood as a talent agency assistant. However, I’ve also always had a passion for coding, ever since I taught myself how to build websites as a […]
11/07/22
How LinkedIn Ditched the "One Size Fits All" Hiring Approach for InfoSec and Won
What drew me to LinkedIn was the chance to take on the challenge of scaling and adapting the security framework for the world’s largest professional network. Today, our Information Security (InfoSec) team is responsible for protecting LinkedIn’s infrastructure and data for our members, customers, and employees. As our platform has scaled, our security programs and strategies have grown in tandem to keep up with the company’s trajectory. But we haven’t stopped there. We want to be more than just protectors of our members, customers, and all the data we process internally and […]
10/31/22
Career stories: Four engineering careers. One LinkedIn.
LinkedIn’s Next Play culture celebrates transformational growth and internal mobility, and engineering leader Shalini is its biggest champion. Based in Silicon Valley, this mom of three walks us through her impactful career journey from our LinkedIn consumer and data teams, then to working on the LinkedIn Sales Solutions team, to now working for LinkedIn Talent Solutions.
Before joining LinkedIn, I worked my way up in Silicon Valley’s consumer software engineering space as a backend (apps) engineer. As my career propelled forward, and I transitioned to managing a team of backend, […]
10/27/22
Career stories: Mobilizing learners worldwide
After stints at a Madrid startup and a Chicago financial-services company, Android mobile engineer Jose was looking for his ideal role in the Bay Area. Now based in San Francisco, he dives into the rewarding technical challenges his team is collaborating on, and how he’s having a meaningful impact on millions of learners worldwide as a core member of the LinkedIn Learning mobile team.
Making the move from Spain to the U.S.
I’m originally from Córdoba, Spain, where I first developed my passion for coding as a teenager. My love for coding led me to pursue an engineering degree in […]
10/18/22
LinkedIn’s GraphQL journey for integrations and partnerships: How we accelerated development by 90%
Co-authors: Mimi Chen, Calvin Lei, and Amit Yadav
Background
LinkedIn’s mission is to connect the world’s professionals to make them more productive and successful. One way we advance this mission is by partnering with other organizations to deliver world class integrations. We are developing a platform-as-a-service (PaaS) that provides exploratory access, insights, and conflation with LinkedIn’s Economic Graph to enable product integrations with strategic partners and customers. Our goal is to build a best-in-class API platform that is easy to use, efficient, and easy for our […]
10/06/22
Skyfall: eBPF agent for infrastructure observability
Currently, LinkedIn infrastructure is composed of hundreds of thousands of hosts across multiple data centers. Observability into our infrastructure makes it possible for us to focus on the health and performance of our critical services to provide the best experience to our members. With LinkedIn's large infrastructure growth over the past few years, observability has become more critical to pinpoint the potential root causes for any infrastructure failure or anomaly. There are a few elegant in-house monitoring systems at LinkedIn that provide network switch level metrics, logs, […]
10/04/22
Super Tables: The road to building reliable and discoverable data products
Co-authors: David Lu, Hong Liu, Thomas Kwan, Christopher Harris, Weiping Si
Many companies, including LinkedIn, have experienced exponential data growth ever since the Apache Hadoop adoption a decade ago. With a proliferation of self-service data authoring tools and publishing platforms, different teams have created and shared datasets to address business needs quickly. While the use of self-service tools and platforms was a scalable and agile way to unlock data value by various teams, it introduced multiple issues: 1) multiple similar datasets often led to inconsistent results and […]
09/28/22
Open Sourcing Venice – LinkedIn’s Derived Data Platform
We are proud to announce the open sourcing of Venice, LinkedIn’s derived data platform that powers more than 1800 of our datasets and is leveraged by over 300 distinct applications. Venice is a high-throughput, low-latency, highly-available, horizontally-scalable, eventually-consistent storage system with first-class support for ingesting the output of batch and stream processing jobs. Venice entered production at the end of 2016 and has been scaling continuously ever since, gradually replacing many other systems, including Voldemort and several proprietary ones. It is used by the […]
09/26/22
Real-time analytics on network flow data with Apache Pinot
The LinkedIn infrastructure has thousands of services serving millions of queries per second. At this scale, having tools that provide observability into the LinkedIn infrastructure is imperative to ensure that issues in our infrastructure are quickly detected, diagnosed, and remediated. This level of visibility helps prevent the occurrence of outages so we can deliver the best experience for our members. To provide observability, there are various data points that need to be collected, such as metrics, events, logs, and flows. Once collected, the data points can then be processed […]
09/13/22
Feathr joins LF AI & Data Foundation
Co-authors: Hangfei Lin, Jinghui Mo
We’re excited to announce today that Feathr is joining LF AI & Data, the Linux Foundation’s umbrella foundation supporting open source innovation in artificial intelligence (AI) and data. Feathr is a feature store that simplifies machine learning (ML) feature serving and improves developer productivity. “We're excited to welcome Feathr to LF AI & Data and for it to be part of our technical project portfolio (41 projects and growing) with a community of over 17K developers,” said Dr. Ibrahim Haddad, Executive Director of LF AI & Data. “We aim to […]
09/12/22
Career stories: Rejoining LinkedIn to scale our media infrastructure
Originally from Argentina, systems & infrastructure engineering leader Federico was a founding member of the Media Infrastructure team in 2015. Now based in Bellevue, Wash., Federico shares how his supportive mentor, LinkedIn’s “sweet spot” scale, and the distinctive engineering challenges here ultimately brought him back to LinkedIn in 2019.
My love for engineering started in my home country of Argentina. After working as an engineer in a corporate setting for a few years, I decided to start my own company focused on custom software development. I loved the interesting problems I […]
09/06/22
Operating system upgrades at LinkedIn’s scale
Co-authors: Hengyang Hu, Dinesh Dhakal, Kalyanasundaram Somasundaram
Introduction
Completing recurring operating system (OS) upgrades on time and without impacting users can be challenging. For LinkedIn, completing these upgrades at a massive scale has its own complexities as we’re often facing multiple upgrades. To secure our platform and protect our members’ data, we needed a fast and reliable OS upgrade framework with little to no human intervention. In this blog, we’ll introduce a newly developed system, Operating System Upgrade Automation (OSUA), which allows LinkedIn to scale […]
08/31/22
Challenges and practical lessons from building a deep-learning-based ads CTR prediction model
Co-authors: Ruoyan Wang, Sirou Zhu, Chengming Jiang
Introduction
At LinkedIn, our ads business is powered by click-through-rate (CTR) prediction, a core machine learning model. CTR prediction estimates the probability of clicks between a LinkedIn member and a potential advertisement. That probability is then used for ads auctions, which decide the order of ads being displayed to members. A better CTR model can enhance the member and advertiser experience by bringing more relevant ads and more efficient advertiser budget spending. In the past, we predicted ads CTR through a GLMix […]