Quick Notes

Things that came on the way

HBase Shell

The HBase shell is a Ruby-based shell used to interact with an HBase cluster and to perform data definition and data manipulation tasks. The shell ships as part of the HBase code base and by default is installed on every node of the HBase cluster. To access the shell, the HBase software needs to be installed on the machine from which the shell is invoked, and the PATH environment variable must include the directory where the hbase program is stored. Users and administrators commonly access the shell through one of the nodes in the HBase cluster. The following is a quick introduction to frequently used HBase shell commands. The shell is invoked with the hbase shell command, which gives the user a command prompt for entering the commands supported by that particular version of HBase.

Java Direct ByteBuffer Performance Advantages and Considerations

During execution, objects created by a Java program get their space allocated in the JVM heap memory. The total amount of heap memory available to a JVM is determined by the value of the -Xmx parameter passed when starting the Java process. When the program no longer references an allocated object, the JVM garbage collection (GC) process makes the corresponding memory available for later use.
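As a minimal sketch of how the heap limit shows up inside a running program, the standard Runtime API reports the maximum heap the JVM will attempt to use, which is governed by -Xmx when that flag is set. The class and method names below (HeapInfo, maxHeapBytes, usedHeapBytes) are illustrative and not from the original notes.

```java
public class HeapInfo {
    // Upper bound on the heap the JVM will attempt to use, in bytes.
    // Governed by -Xmx when set; Long.MAX_VALUE if there is no inherent limit.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently occupied by objects (live or not yet collected), in bytes.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):  " + maxHeapBytes() / mb);
        System.out.println("used heap (MB): " + usedHeapBytes() / mb);
    }
}
```

Running this with different values, for example java -Xmx256m HeapInfo and java -Xmx1g HeapInfo, should show the reported maximum changing accordingly.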

The GC process is typically invoked when the amount of free memory in the JVM falls below a certain threshold. At a very high level, the GC process identifies objects that are no longer used (i.e., no longer referenced), releases their memory, and compacts the remaining memory to reduce fragmentation. Readers who are interested in the details of the GC process can find them here. As one can imagine, the time it takes to complete the GC process generally increases with the size of the Java heap, since it takes more time to identify the objects that can be released and to perform compaction.
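Since the topic here is direct ByteBuffers, the following hedged sketch contrasts the two allocation calls in java.nio. A buffer from ByteBuffer.allocate is backed by an ordinary byte array on the heap, while the contents of a buffer from ByteBuffer.allocateDirect may reside outside the normal garbage-collected heap. The class and helper names (BufferKinds, heapBuffer, directBuffer, roundTrip) are illustrative assumptions, not from the original notes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BufferKinds {
    // Heap buffer: backed by a byte[] that lives in the JVM heap.
    static ByteBuffer heapBuffer(int capacity) {
        return ByteBuffer.allocate(capacity);
    }

    // Direct buffer: contents may live outside the garbage-collected heap.
    static ByteBuffer directBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    // Writes text into the buffer and reads it back; both kinds share this API.
    static String roundTrip(ByteBuffer buf, String text) {
        byte[] in = text.getBytes(StandardCharsets.UTF_8);
        buf.clear();
        buf.put(in);
        buf.flip(); // switch from writing to reading
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer heap = heapBuffer(64);
        ByteBuffer direct = directBuffer(64);
        System.out.println("heap isDirect:   " + heap.isDirect());
        System.out.println("direct isDirect: " + direct.isDirect());
        System.out.println(roundTrip(direct, "hello"));
    }
}
```

Both buffers are read and written the same way; only where the bytes are stored differs, which is what matters for the GC cost described above.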

Secure All Applications Please

When you work with enterprises, you often see batch applications storing the credentials they use to log in to databases, messaging infrastructure, or other enterprise applications as plain text in config files. These batch applications also don’t get the same attention as customer-facing applications when it comes to security. If you have similar application configurations and the thinking is that these batch applications sit behind the firewall in a DMZ and hence pose less risk, think again. As anyone who works in computer forensics or security can attest, data breaches are most often perpetrated by insiders, and these incidents rarely get reported or receive media attention. If you are looking for numbers, here is a summary of the 2012 security incident report from Forrester.

De-risking scenarios like these doesn’t require a complex solution. It can be a matter of following a simple process, similar to the following, across the enterprise:

Chef HWRP Using an Example

The Heavy Weight Resource Provider (HWRP) is one of the two options Chef offers for creating custom resources, the other being the Lightweight Resource Provider (LWRP). It is worth reading the notes on LWRP first to understand the context and the difference between LWRP and HWRP.

As with an LWRP, an HWRP requires a resource definition and a corresponding provider. The key difference is that an HWRP has no DSL as an LWRP does; everything is written in plain Ruby. Taking the same HDFS directory resource used as the example in the notes on LWRP, the following is the skeleton of the resource definition.

Chef LWRP Using HDFS Directory as an Example

Chef provides a large set of resources to work with, but there are situations where those resources are not sufficient. For example, distributed file systems can’t be handled by the file system resources (file, directory, etc.) that come out of the box with Chef. Being flexible and customizable, Chef provides two options (LWRP and HWRP) for users to create their own resources.