How Zurich Insurance Group built a log management solution on AWS

This post is written in collaboration with Clarisa Tavolieri, Austin Rappeport, and Samantha Gignac from Zurich Insurance Group.

The volume and number of logging sources have grown exponentially over the last few years and will continue to grow in the coming years. As a result, customers across all industries are facing multiple challenges such as:

  • Balancing storage costs against meeting long-term log retention requirements
  • Bandwidth issues when moving logs between the cloud and on premises
  • Resource scaling and performance issues when trying to analyze massive amounts of log data
  • Keeping pace with growing storage requirements, while also being able to provide insights from the data
  • Aligning license costs for Security Information and Event Management (SIEM) vendors with log processing, storage, and performance requirements. SIEM solutions help you implement real-time reporting by monitoring your environment for security threats and alerting on threats once detected.

Zurich Insurance Group (Zurich) is a leading multi-line insurer providing property, casualty, and life insurance solutions globally. In 2022, Zurich began a multi-year program to accelerate their digital transformation and innovation through the migration of 1,000 applications to AWS, including core insurance and SAP workloads.

The Zurich Cyber Fusion Center management team faced similar challenges, such as balancing the licensing costs of ingestion against long-term retention requirements for both business application log and security log data within the existing SIEM architecture. Zurich wanted to identify a log management solution to work in conjunction with their existing SIEM solution. The new approach would need to offer the flexibility to integrate new technologies such as machine learning (ML), the scalability to handle long-term retention at forecasted growth levels, and options for cost optimization. In this post, we discuss how Zurich built a hybrid architecture on AWS incorporating AWS services to satisfy their requirements.

Solution overview

Zurich and AWS Professional Services collaborated to build an architecture that addressed decoupling long-term storage of logs, distributing analytics and alerting capabilities, and optimizing storage costs for log data. The solution was based on categorizing and prioritizing log data into priority levels 1–3, and routing logs to different destinations based on priority. The following diagram illustrates the solution architecture.

The workflow steps are as follows:

  1. All of the logs (P1, P2, and P3) are collected and ingested into an extract, transform, and load (ETL) service, AWS Partner Cribl’s Stream product, in real time. Capturing and streaming of logs is configured per use case based on the capabilities of the source, such as using built-in forwarders, installing agents, using Cribl Stream, and using AWS services like Amazon Data Firehose. This ETL service performs two functions before data reaches the analytics layer:
    1. Data normalization and aggregation – The raw log data is normalized and aggregated in the required format to perform analytics. The process consists of normalizing log field names, standardizing on JSON, removing unused or duplicate fields, and compressing to reduce storage requirements.
    2. Routing mechanism – Upon completing data normalization, the ETL service applies the necessary routing mechanisms to send log data to the respective downstream systems based on category and priority.
  2. Priority 1 logs, such as network detection and response (NDR), endpoint detection and response (EDR), and cloud threat detection services (for example, Amazon GuardDuty), are ingested directly into the existing on-premises SIEM solution for real-time analytics and alerting.
  3. Priority 2 logs, such as operating system security logs, firewall, identity provider (IdP), email metadata, and AWS CloudTrail, are ingested into Amazon OpenSearch Service to enable the following capabilities. Previously, P2 logs were ingested into the SIEM.
    1. Systematically detect potential threats and react to a system’s state through alerting, integrating those alerts back into Zurich’s SIEM for larger correlation. This reduces the amount of data ingested into Zurich’s SIEM by approximately 85%. Eventually, Zurich plans to use ML plugins such as anomaly detection to enhance analysis.
    2. Develop log and trace analytics solutions with interactive queries and visualize results with high adaptability and speed.
    3. Reduce the average time to ingest and the average time to search as the scale of log data increases.
    4. In the future, Zurich plans to use OpenSearch’s security analytics plugin, which can help security teams quickly detect potential security threats by using over 2,200 pre-built, publicly available Sigma security rules or by creating custom rules.
  4. Priority 3 logs, such as logs from enterprise applications and vulnerability scanning tools, are not ingested into the SIEM or OpenSearch Service, but are forwarded to Amazon Simple Storage Service (Amazon S3) for storage. These can be queried as needed using one-time queries.
  5. Copies of all log data (P1, P2, P3) are sent in real time to Amazon S3 for highly durable, long-term storage to satisfy the following:
    1. Long-term data retention – S3 Object Lock is used to enforce data retention per Zurich’s compliance and regulatory requirements.
    2. Cost-optimized storage – Lifecycle policies automatically transition data with less frequent access patterns to lower-cost Amazon S3 storage classes. Zurich also uses lifecycle policies to automatically expire objects after a predefined period. Lifecycle policies provide a mechanism to balance the cost of storing data and meeting retention requirements.
    3. Historical data analysis – Data stored in Amazon S3 can be queried to satisfy one-time audit or analysis tasks. Eventually, this data could be used to train ML models to support better anomaly detection. Zurich has done testing with Amazon SageMaker and has plans to add this capability in the near future.
  6. One-time query analysis – Simple audit use cases require historical data to be queried based on different time intervals, which can be performed using Amazon Athena and AWS Glue analytic services. By using Athena and AWS Glue, both serverless services, Zurich can perform simple queries without the heavy lifting of running and maintaining servers. Athena supports a variety of compression formats for reading and writing data. Therefore, Zurich is able to store compressed logs in Amazon S3 to achieve cost-optimized storage while still being able to perform one-time queries on the data.
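To make the normalization and routing steps more concrete, the following is a minimal Python sketch of the logic, not Zurich's actual Cribl Stream configuration. The source-type names, the field aliases, the destination labels, and the choice to treat unknown sources as priority 3 are all assumptions made for illustration.

```python
import gzip
import json

# Hypothetical mapping from log source type to priority level.
# The real categorization is Zurich's own and is not given in this post.
SOURCE_PRIORITY = {
    "ndr": 1, "edr": 1, "guardduty": 1,
    "os_security": 2, "firewall": 2, "idp": 2,
    "email_metadata": 2, "cloudtrail": 2,
    "business_app": 3, "vuln_scan": 3,
}

# Hypothetical aliases used to normalize field names across sources.
FIELD_ALIASES = {"src": "source_ip", "srcip": "source_ip", "ts": "timestamp"}

# Every priority also gets a copy in S3, mirroring step 5 of the workflow.
DESTINATIONS = {1: ["siem", "s3"], 2: ["opensearch", "s3"], 3: ["s3"]}


def normalize(record: dict) -> dict:
    """Lowercase and alias field names, drop empty and duplicate fields."""
    out = {}
    for key, value in record.items():
        if value in (None, ""):
            continue  # remove unused fields
        name = FIELD_ALIASES.get(key.lower(), key.lower())
        out.setdefault(name, value)  # keep the first of any duplicate fields
    return out


def destinations_for(record: dict) -> list:
    """Pick downstream systems by priority (assumption: unknown sources are P3)."""
    priority = SOURCE_PRIORITY.get(record.get("source_type"), 3)
    return DESTINATIONS[priority]


def encode(record: dict) -> bytes:
    """Standardize on JSON and compress to reduce storage requirements."""
    return gzip.compress(json.dumps(record, sort_keys=True).encode("utf-8"))


raw = {"SRC": "10.0.0.1", "srcip": "10.0.0.1", "ts": "2024-05-01T12:00:00Z",
       "source_type": "firewall", "debug": ""}
event = normalize(raw)
print(destinations_for(event))
print(len(encode(event)), "bytes compressed")
```

Keeping the priority table separate from the routing table means a log source can be reclassified (for example, from P1 to P2) without touching the routing code.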

As a future capability, on-demand, complex querying, analysis, and reporting on large historical datasets could be supported using Amazon OpenSearch Serverless. Also, OpenSearch Service supports zero-ETL integration with Amazon S3, where users can query their data stored in Amazon S3 using OpenSearch Service query capabilities.

The solution outlined in this post provides Zurich with an architecture that supports scalability, resilience, cost optimization, and flexibility. We discuss these key benefits in the following sections.

Scalability

Given the volume of data currently being ingested, Zurich needed a solution that could satisfy existing requirements and provide room for growth. In this section, we discuss how Amazon S3 and OpenSearch Service help Zurich achieve scalability.

Amazon S3 is an object storage service that offers industry-leading scalability, data availability, security, and performance. The total volume of data and number of objects you can store in Amazon S3 are virtually unlimited. Based on its unique architecture, Amazon S3 is designed to exceed 99.999999999% (11 nines) of data durability. Additionally, Amazon S3 stores data redundantly across a minimum of three Availability Zones (AZs) by default, providing built-in resilience against widespread disaster. For example, the S3 Standard storage class is designed for 99.99% availability. For more information, check out the Amazon S3 FAQs.

Zurich uses AWS Partner Cribl’s Stream solution to route copies of all log information to Amazon S3 for long-term storage and retention, enabling Zurich to decouple log storage from their SIEM solution, a common challenge facing SIEM solutions today.

OpenSearch Service is a managed service that makes it simple to run OpenSearch without having to manage the underlying infrastructure. Zurich’s current on-premises SIEM infrastructure comprises more than 100 servers, all of which have to be operated and maintained. Zurich hopes to reduce this infrastructure footprint by 75% by offloading priority 2 and 3 logs from their existing SIEM solution.

To support geographies with restrictions on cross-border data transfer and to meet availability requirements, AWS and Zurich worked together to define an Amazon OpenSearch Service configuration that would support 99.9% availability using multiple AZs in a single Region.

OpenSearch Service supports cross-Region and cross-cluster queries, which helps with distributing analysis and processing of logs without moving data, and provides the ability to aggregate information across clusters. Because Zurich plans to deploy multiple OpenSearch domains in different Regions, they will use cross-cluster search functionality to query data seamlessly across different regional domains without moving data. Zurich also configured a connector for their existing SIEM to query OpenSearch, which further allows distributed processing from on premises, and enables aggregation of data across data sources. As a result, Zurich is able to distribute processing, decouple storage, and publish key information in the form of alerts and queries to their SIEM solution without having to ship log data.
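As a rough illustration of cross-cluster search, a request addresses a remote domain by prefixing the index pattern with a connection alias, in the form `alias:index`. The sketch below only builds the search path and query body; it does not send anything. The connection aliases, the index pattern, the `@timestamp` field, and the `event.action` field are hypothetical names chosen for the example.

```python
import json


def cross_cluster_target(index_pattern, remote_aliases, include_local=True):
    """Build a comma-separated index expression spanning local and remote domains."""
    targets = [index_pattern] if include_local else []
    targets += [f"{alias}:{index_pattern}" for alias in remote_aliases]
    return ",".join(targets)


def recent_events_query(field, value, days=7):
    """Query body matching one field value within the last `days` days."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {field: value}},
                    # Assumption: events carry an "@timestamp" date field.
                    {"range": {"@timestamp": {"gte": f"now-{days}d/d"}}},
                ]
            }
        }
    }


# Hypothetical connection aliases for two other regional domains.
target = cross_cluster_target("p2-logs-*", ["emea-domain", "apac-domain"])
path = f"/{target}/_search"
body = json.dumps(recent_events_query("event.action", "ConsoleLogin"))
print(path)
print(body)
```

The query runs where the data lives and only the results travel back, which is what lets regional domains satisfy data-residency constraints while still supporting a combined view.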

In addition, many of Zurich’s business units have logging requirements that can be satisfied using the same AWS services (OpenSearch Service, Amazon S3, AWS Glue, and Amazon Athena). As such, the AWS components of the architecture were templatized using Infrastructure as Code (IaC) for consistent, repeatable deployment. These components are already being used across Zurich’s business units.

Cost optimization

In thinking about optimizing costs, Zurich had to consider how they would continue to ingest 5 TB per day of security log information just for their centralized security logs. In addition, lines of business needed similar capabilities to meet requirements, which could include processing 500 GB per day.

With this solution, Zurich can control (by offloading P2 and P3 log sources) the portion of logs that are ingested into their primary SIEM solution. As a result, Zurich has a mechanism to manage licensing costs, as well as improve the efficiency of queries by reducing the amount of information the SIEM needs to parse on search.

Because copies of all log data are going to Amazon S3, Zurich is able to take advantage of the different Amazon S3 storage tiers, such as using S3 Intelligent-Tiering to automatically move data among Infrequent Access and Archive Access tiers, to optimize the cost of retaining multiple years’ worth of log data. When data is moved to the Infrequent Access tier, costs are reduced by up to 40%. Similarly, when data is moved to the Archive Instant Access tier, storage costs are reduced by up to 68%.

Refer to Amazon S3 pricing for current pricing, as well as for information by Region. Moving data to S3 Infrequent Access and Archive Access tiers provides a significant cost savings opportunity while meeting long-term retention requirements.
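A lifecycle policy of the kind described here can be expressed as a small configuration document. The following sketch builds one in Python; the prefix, the rule ID, and the day counts are placeholder assumptions, not Zurich's actual settings. The resulting dictionary has the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument, but the sketch deliberately stops short of calling AWS.

```python
def build_lifecycle_configuration(prefix, transition_days, expire_days):
    """Build an S3 lifecycle configuration that tiers, then expires, log objects."""
    if expire_days <= transition_days:
        raise ValueError("objects must expire after they transition")
    return {
        "Rules": [
            {
                "ID": "log-retention",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Hand objects to S3 Intelligent-Tiering, which then moves them
                # between access tiers based on observed access patterns.
                "Transitions": [
                    {"Days": transition_days, "StorageClass": "INTELLIGENT_TIERING"}
                ],
                # Expire objects once the retention period has passed.
                "Expiration": {"Days": expire_days},
            }
        ]
    }


# Placeholder values for illustration only.
config = build_lifecycle_configuration("logs/p3/", transition_days=30, expire_days=400)
print(config["Rules"][0]["Transitions"])

# Applying it would look roughly like this (requires boto3 and AWS credentials):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-log-archive", LifecycleConfiguration=config)
```

Where S3 Object Lock is also in use, the expiration period should be chosen to be no shorter than the Object Lock retention period, so the two mechanisms express the same retention intent.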

The team at Zurich analyzed priority 2 log sources and, based on historical analytics and query patterns, determined that only the most recent 7 days of logs are typically required. Therefore, OpenSearch Service was right-sized to retain 7 days of logs in a hot tier. Rather than configuring UltraWarm and cold storage tiers for OpenSearch Service, copies of the remaining logs were simultaneously sent to Amazon S3 for long-term retention, where they could be queried using Athena.
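A one-time query against the archived logs might look like the following sketch, which only builds the SQL text. The table name, the `dt` date partition column, and the selected column names are hypothetical; the post does not describe Zurich's actual schema.

```python
import ipaddress
from datetime import date


def build_audit_query(table: str, start: date, end: date, source_ip: str) -> str:
    """Build an Athena SQL query for one source IP over a date range."""
    if end < start:
        raise ValueError("end must not be before start")
    # Validate the address so that only a well-formed IP reaches the SQL text.
    ip = ipaddress.ip_address(source_ip)
    return (
        "SELECT event_time, source_type, source_ip, message "
        f"FROM {table} "
        # Assumption: the table is partitioned by a 'dt' column (YYYY-MM-DD),
        # so filtering on it limits how much data Athena has to scan.
        f"WHERE dt BETWEEN '{start.isoformat()}' AND '{end.isoformat()}' "
        f"AND source_ip = '{ip}' "
        "ORDER BY event_time"
    )


sql = build_audit_query(
    "security_logs.p2_archive",  # hypothetical database.table
    date(2024, 1, 1),
    date(2024, 1, 31),
    "10.0.0.1",
)
print(sql)
```

Because Athena charges by data scanned, restricting the query to a partition range and storing the logs compressed are the two levers that keep occasional audit queries inexpensive.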

The combination of cost-optimization options is projected to reduce the cost per GB of log data ingested and stored for 13 months by 53% compared to the previous approach.
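The figures quoted in this section give a sense of scale for the hot and archive tiers. The short calculation below uses the 5 TB per day, 7-day, and 13-month numbers from the text and assumes 30-day months. It is an upper-bound sketch on raw volume: it ignores compression, replicas, and the fact that only priority 2 logs (a share the post does not state) are kept in OpenSearch Service.

```python
TB_PER_DAY = 5          # centralized security log ingest, from the text
HOT_DAYS = 7            # hot-tier retention in OpenSearch Service, from the text
RETENTION_MONTHS = 13   # retention window used in the cost comparison
DAYS_PER_MONTH = 30     # assumption for a round estimate


def raw_volume_tb(tb_per_day: float, days: int) -> float:
    """Raw (uncompressed, unreplicated) volume accumulated over `days`."""
    return tb_per_day * days


hot_tb = raw_volume_tb(TB_PER_DAY, HOT_DAYS)
archive_tb = raw_volume_tb(TB_PER_DAY, RETENTION_MONTHS * DAYS_PER_MONTH)
hot_share = hot_tb / archive_tb

print(f"hot tier upper bound: {hot_tb} TB")
print(f"13-month archive:     {archive_tb} TB")
print(f"hot share of archive: {hot_share:.1%}")
```

Even as an upper bound, the hot tier holds under 2% of the 13-month volume, which is why keeping only recent data searchable and pushing the rest to lower-cost object storage has such a large effect on cost per GB.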

Flexibility

Another key consideration for the architecture was the flexibility to integrate with existing alerting systems and data pipelines, as well as the ability to incorporate new technology into Zurich’s log management approach. For example, Zurich configured a connector for their existing SIEM to query OpenSearch, which allows distributed processing from on premises and enables aggregation of data across data sources.

Within the OpenSearch Service software, there are options to expand log analysis using security analytics with predefined indicators of compromise across common log types. OpenSearch Service also offers the capability to integrate with ML capabilities such as anomaly detection and alert correlation to enhance log analysis.

With the introduction of Amazon Security Lake, there’s another opportunity to expand the solution to more efficiently manage AWS logging sources and add to this architecture. For example, you can use Amazon OpenSearch Ingestion to generate security insights on security data from Amazon Security Lake.

Summary

In this post, we reviewed how Zurich was able to build a log data management architecture that provided the scalability, flexibility, performance, and cost-optimization mechanisms needed to meet their requirements.

To learn more about components of this solution, visit the Centralized Logging with OpenSearch implementation guide, review Querying AWS service logs, or run through the SIEM on Amazon OpenSearch Service workshop.


About the Authors

Clarisa Tavolieri is a Software Engineering graduate with qualifications in Business, Audit, and Strategy Consulting. With an extensive career in the financial and tech industries, she specializes in data management and has been involved in initiatives ranging from reporting to data architecture. She currently serves as the Global Head of Cyber Data Management at Zurich Group. In her role, she leads the data strategy to support the protection of company assets and implements advanced analytics to enhance and monitor cybersecurity tools.

Austin Rappeport is a Computer Engineer who graduated from the University of Illinois Urbana/Champaign in 2011 with a focus in Computer Security. After graduation, he worked for the Federal Energy Regulatory Commission in the Office of Electric Reliability, working with the North American Electric Reliability Corporation’s Critical Infrastructure Protection Standards on both the audit and enforcement side, as well as standards development. Austin currently works for Zurich Insurance as the Global Head of Detection Engineering and Automation, where he leads the team responsible for using Zurich’s security tools to detect suspicious and malicious activity and improve internal processes through automation.

Samantha Gignac is a Global Security Architect at Zurich Insurance. She graduated from Ferris State University in 2014 with a Bachelor’s degree in Computer Systems & Network Engineering. With experience in the insurance, healthcare, and supply chain industries, she has held roles such as Storage Engineer, Risk Management Engineer, Vulnerability Management Engineer, and SOC Engineer. As a Cybersecurity Architect, she designs and implements secure network systems to protect organizational data and infrastructure from cyber threats.

Claire Sheridan is a Principal Solutions Architect with Amazon Web Services working with global financial services customers. She holds a PhD in Informatics and has more than 15 years of industry experience in tech. She loves traveling and visiting art galleries.

Jake Obi is a Principal Security Consultant with Amazon Web Services based in South Carolina, US, with over 20 years’ experience in information technology. He helps financial services customers improve their security posture in the cloud. Prior to joining Amazon, Jake was an Information Assurance Manager for the US Navy, where he worked on a large satellite communications program as well as hosting government websites using the public cloud.

Srikanth Daggumalli is an Analytics Specialist Solutions Architect at AWS. Of his 18 years of experience, he has spent over a decade architecting cost-effective, performant, and secure enterprise applications that improve customer reachability and experience, using big data, AI/ML, cloud, and security technologies. He has built high-performing data platforms for major financial institutions, enabling improved customer reach and exceptional experiences. He specializes in services like cross-border transactions and in architecting robust analytics platforms.

Freddy Kasprzykowski is a Senior Security Consultant with Amazon Web Services based in Florida, US, with over 20 years’ experience in information technology. He helps customers adopt AWS services securely according to industry best practices, standards, and compliance regulations. He is a member of the Customer Incident Response Team (CIRT), helping customers during security events, a seasoned speaker at AWS re:Invent and AWS re:Inforce conferences, and a contributor to open source projects related to AWS security.
