Tech Musings

Wednesday, December 14, 2016

Cassandra: nodetool usage

Nodetool is an out-of-the-box (OOTB) tool that ships with Cassandra. It provides various options to manage the keyspaces. Here are three useful options I came across:

1) I had to copy a keyspace from one server to another. Since the source data came from multiple servers, it is essential to clean up the data after the db and all its related files are copied. That is a job for nodetool, which usually resides in <cassandra-path>/bin. To clean up, give the following command:

nodetool -h <hostname of the cassandra> cleanup

Depending on the amount of data to be cleaned, this will take some time. To track the progress, you can use:

nodetool compactionstats

This will print the following:

pending tasks: 1
compaction type   keyspace          column family          completed    total        unit    progress
Cleanup           <keyspace-name>   <column-family-name>   1667500496   2004577146   bytes   83.18%

Note: the value in the total column can change while the cleanup runs, because the size is calculated in real time.
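The progress figure can be recomputed from the completed and total columns. Here is a minimal Python sketch, assuming the row layout shown above (the last four fields are completed, total, unit and progress). The function name parse_compaction_row and the keyspace/table names are made up for illustration:

```python
def parse_compaction_row(row):
    """Split one data row of `nodetool compactionstats` output.

    Assumes the layout shown in this post: the last four whitespace-separated
    fields are completed, total, unit and progress.
    """
    head, completed, total, unit, progress = row.rsplit(None, 4)
    completed, total = int(completed), int(total)
    percent = 100.0 * completed / total if total else 0.0
    return {
        "task": head,
        "completed": completed,
        "total": total,
        "unit": unit,
        "reported": progress,
        "computed": "{:.2f}%".format(percent),
    }


# Hypothetical keyspace and column family names.
row = "Cleanup my_keyspace my_table 1667500496 2004577146 bytes 83.18%"
info = parse_compaction_row(row)
print(info["computed"], "done,", info["total"] - info["completed"], info["unit"], "left")
```

Since the total can move while the cleanup runs, treat the remaining amount as an estimate.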

2) To get the statistics of all the keyspaces, and especially a rough idea of how many keys each column family holds, use:

nodetool -h <hostname> cfstats

Friday, December 9, 2016

MSSQL: How to transfer a set of rows from one table to another?

My requirement was to transfer a specific set of rows from one table to another, where both tables have exactly the same schema. In MSSQL, follow these steps:

1) First get the column list from the destination table. Use:

SELECT SUBSTRING(
(SELECT ', ' + QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '<destination-table-name>'
ORDER BY ORDINAL_POSITION
FOR XML path('')),
3,
200000);

2) If the destination table has an identity column (typically the ID primary key), then you need to turn on the identity_insert option so that the copied ID values can be inserted explicitly:

set identity_insert <destination-table> on;

3) Then run an insert...into...select query. The select can have any valid condition to get the specific set of data. A typical query looks like this:

insert into <destination-table> (<destination-table-column-list-from-the-query-in-step-1>) SELECT <source-table-column-names>
FROM <source-table> where <any condition that will load only those records>;

4) Don't forget to turn off the identity_insert option you turned on in step 2):

set identity_insert <destination-table> off;
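If you do this often, steps 2 to 4 can be generated from the column list. The following is only a Python sketch of the idea: the function names, the tables and the columns are made up, and the bracket quoting mimics what QUOTENAME does by default.

```python
def quote_name(identifier):
    """Bracket-quote an identifier the way T-SQL QUOTENAME does by default."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_copy_script(source, destination, columns, where, identity=True):
    """Return the statements from steps 2 to 4 as one script.

    Table names and the WHERE text are inserted as-is, so pass only
    trusted values.
    """
    column_list = ", ".join(quote_name(c) for c in columns)
    statements = []
    if identity:
        statements.append("set identity_insert {} on;".format(destination))
    statements.append(
        "insert into {dst} ({cols}) select {cols} from {src} where {cond};".format(
            dst=destination, cols=column_list, src=source, cond=where
        )
    )
    if identity:
        statements.append("set identity_insert {} off;".format(destination))
    return "\n".join(statements)


# Hypothetical tables and columns, for illustration only.
script = build_copy_script(
    source="dbo.emails_archive",
    destination="dbo.emails",
    columns=["id", "subject", "sent_on"],
    where="sent_on >= '2016-01-01'",
)
print(script)
```

Using the same column list on both sides only works because the two tables share the same schema.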

And that is all...

Wednesday, November 30, 2016

Installing Jupyter Notebook and Numpy in Windows

I had a hard time installing Jupyter Notebook. My machine already had Python 2.7, and after I installed Anaconda, running the 'jupyter notebook' command threw Python errors.

Then I uninstalled Python 2.7 and let Anaconda install its own Python version again. No luck.

Finally, I removed all the Python installations, but forgot to remove the path variables (more on this below). Then I installed Python 3.5 (take the latest 3.5 release).

After that, pip3 install jupyter installed the notebook successfully.

Running 'jupyter notebook' then gave the error message "Fatal Python error: Py_Initialize: unable to load the file system codec". Upon checking the forums, one of the solutions was to check the PATH and PYTHONPATH environment variables. Bingo: they were still referring to the older and now removed installations of Python.
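A quick way to spot such leftovers is to list the entries that no longer exist on disk. This is just a small Python sketch (the function name stale_entries is my own), and it has to be run with a Python that actually starts:

```python
import os


def stale_entries(value, sep=os.pathsep):
    """Return the entries of a PATH-style string that do not exist on disk."""
    entries = [e.strip().strip('"') for e in value.split(sep)]
    return [e for e in entries if e and not os.path.exists(e)]


for name in ("PATH", "PYTHONPATH"):
    for entry in stale_entries(os.environ.get(name, "")):
        print("{}: missing entry {}".format(name, entry))
```

Anything it prints is a candidate for removal. Note that it cannot detect an entry that exists but belongs to the wrong Python version.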

After I removed them, 'jupyter notebook' brought up the index page in the browser, but with a 404 error.

Again, the forums helped me. It looks like there may be a bug in the jupyter version I downloaded: if you run 'jupyter notebook' from your root folder (c:\), this error happens. This could be because the root folder has hidden system files. Switch to any other folder, say c:\tmp, and run the command again; it brings up the page with the file contents of that folder.

Numpy is a bit different. Since numpy.org provides no installer, you first need to download the binary wheel for the Python version you're using. Important: make sure you download the right version of the whl file. If you're using Python 3.5, look for the one whose name contains cp35.

Then go to the download folder and run pip3 install <whl file name>. For example, pip3 install "numpy-1.11.2+mkl-cp35-cp35m-win32.whl". This will install the numpy package.
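To double-check the file name before installing, you can compare its Python tag with the interpreter you are running. A rough Python sketch, assuming a simple wheel name with a single CPython tag like the one above (the helper names are my own):

```python
import sys


def wheel_python_tag(filename):
    """Return the Python tag of a wheel file name, e.g. 'cp35'.

    Wheel names follow name-version[-build]-python-abi-platform.whl, so the
    Python tag is the third field from the end.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    return filename[:-len(".whl")].split("-")[-3]


def matches_interpreter(filename, version=None):
    """True if the wheel's CPython tag matches the given (major, minor) version."""
    major, minor = (version or sys.version_info)[:2]
    return wheel_python_tag(filename) == "cp{}{}".format(major, minor)


wheel = "numpy-1.11.2+mkl-cp35-cp35m-win32.whl"
print(wheel_python_tag(wheel), matches_interpreter(wheel))
```

This only looks at the Python version. The platform part (win32 here) still has to match your Python build as well.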

That's all, folks!

Wednesday, November 16, 2016

Using JBoss JConsole

If you're using JBoss as your app server, then you don't need to set up the JDK settings for JMX. Instead, you need to check the following: http://stackoverflow.com/questions/17105933/enabling-jmx-remote-in-jboss-6-1

Then run JBOSS_HOME/bin/jconsole.sh (or .bat for Windows).

In the "Remote" section, enter the following: service:jmx:remoting-jmx://<server-name>:9999 (the default OOTB port number is 9999; check the 'management-native' value in standalone.xml).
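If you connect to several servers, a tiny helper avoids typos in that URL. A Python sketch; the function name and the host name are made up:

```python
def jboss_jmx_url(host, port=9999):
    """Build the JConsole service URL for JBoss's native management port."""
    if not 0 < int(port) < 65536:
        raise ValueError("port out of range: {}".format(port))
    return "service:jmx:remoting-jmx://{}:{}".format(host, int(port))


# "appserver01" is a made-up host name.
print(jboss_jmx_url("appserver01"))
```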

Then enter the username and password we created using add-user.sh (or .bat) in jboss/bin.

Tuesday, November 15, 2016

Using Jstatd

Java VisualVM (jvisualvm) is useful for profiling your server applications, local or remote. To profile a remote application, you need to run the jstatd daemon on the server where your application is running. Running jstatd is not that straightforward. Follow these steps:

1) Create a policy file. The policy tool is in JAVA_HOME/jre/bin. Start policytool.exe

2) You need to provide the security policy for tools.jar, which is usually in JAVA_HOME/lib. DO NOT GIVE A VARIABLE-BASED PATH such as $JAVA_HOME/lib/tools.jar; jstatd doesn't recognize it. Provide the absolute path instead.

3) Click "Add Policy Entry" and enter the following in the "CodeBase" text box: file:/usr/java/default/lib/tools.jar (for non-Windows; check the Java path on your machine) or, for example, "c:/Program Files (x86)/Java/jdk1.7.0_79/lib/tools.jar" (for Windows; use "/" in Windows paths too).

4) Click "Add Permission" and enter the following in "AllPermission" text box: java.security.AllPermission. Click Ok.

5) Click "Done"

6) Click File->Save and save it.

7) Copy this policy file to the remote server (if needed).

8) Go to the $JAVA_HOME/bin folder, where the jstatd executable is.

9) Run the following: jstatd -J-Djava.security.policy=/<absolute path to the folder where the policy file is>/<policyfile name> -p 1234

Note: The -p option may not be required in some cases (without it, jstatd uses the default RMI registry port, 1099). The value 1234 is just an arbitrarily chosen port.
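For reference, the policy file that the steps above produce is a short text file, so it can also be written by hand or by a script. Below is a Python sketch that renders such a file and the matching command line. The function names and the /home/me path are made up, and writing the codeBase as a file: URL on every platform is my assumption:

```python
def policy_text(tools_jar):
    """Render a policy that grants all permissions to the given tools.jar.

    tools_jar must be an absolute path written with forward slashes.
    """
    if "$" in tools_jar:
        raise ValueError("use an absolute path, not a variable: " + tools_jar)
    url = "file:" + (tools_jar if tools_jar.startswith("/") else "/" + tools_jar)
    return (
        'grant codeBase "{}" {{\n'
        "    permission java.security.AllPermission;\n"
        "}};\n"
    ).format(url)


def jstatd_command(policy_path, port=None):
    """Build the argument list for step 9; -p is added only if a port is given."""
    cmd = ["jstatd", "-J-Djava.security.policy=" + policy_path]
    if port is not None:
        cmd += ["-p", str(port)]
    return cmd


print(policy_text("/usr/java/default/lib/tools.jar"))
# Hypothetical location of the saved policy file.
print(" ".join(jstatd_command("/home/me/jstatd.policy", 1234)))
```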

Now run Java VisualVM, click on "Remote" and add the remote server name. You should see the jstatd process.

Monday, August 24, 2015

Switching SOLR JDK version

In some cases, your JAVA_HOME environment variable may not point to the version SOLR demands. For example, my development environment demands JDK 6, whereas SOLR 5.x requires JDK 1.7 or above.

SOLR provides its own JAVA_HOME-style environment variable, SOLR_JAVA_HOME. Set this variable to the JDK installation you need and you are all set.

In Windows, use the Control Panel->Environment Variables section to set the SOLR_JAVA_HOME.

Then run solr start -p <port-number>; the usual <port-number> is 8983.
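If you would rather not change the system-wide settings, the variable can also be set only for the process that launches SOLR. A Python sketch of that idea; the function name and the JDK location are made up:

```python
import os


def solr_start(jdk_home, port=8983, base_env=None):
    """Return (command, environment) for starting SOLR with a specific JDK."""
    env = dict(os.environ if base_env is None else base_env)
    env["SOLR_JAVA_HOME"] = jdk_home
    return ["solr", "start", "-p", str(port)], env


# The JDK location is a made-up example; JAVA_HOME is left untouched.
command, env = solr_start("/opt/jdk1.7.0_79")
print(" ".join(command), "with SOLR_JAVA_HOME =", env["SOLR_JAVA_HOME"])
# One way to launch it from the SOLR bin folder: subprocess.call(command, env=env)
```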

That is all.

Wednesday, May 27, 2015

Using serialver JDK command

If you are implementing java.io.Serializable, it is good practice to declare a serialVersionUID for that class. Otherwise, when the class structure changes (attributes are added or removed), you will have the nightmare of handling mismatched serialVersionUID issues.

You can provide any random value to the serialVersionUID but it is better to let the tools generate the unique IDs for each class.

Eclipse comes with a built-in option to generate the serialVersionUID. However, if the project does not compile successfully (typically because of dependency issues), or if you don't want Eclipse to build your project, generating the version UID that way is not possible.

That is where serialver command line comes to the rescue. The format is:

serialver [-classpath classpath] [-show] [className]

[-classpath classpath] - The root folder of your class files. For ex. if you have <myfolder>/target/classes/com/mytest/MyOnlyTest.class, then your class files' root folder is <myfolder>/target/classes/.

Most of the time the classpath value is required. If you run the serialver command from "<myfolder>/target/classes", the classpath value will be "./". It is always better to start from the folder that has the root package as its subdirectory.

[-show] - Opens a simple UI instead of printing to the console. It works just like the command line.

[className] - The fully qualified class name (that is including the package name). For ex. com.mytest.MyOnlyTest
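The relation between the classpath root and the class name can be sketched in a few lines of Python. The function name is made up, and paths are assumed to use forward slashes:

```python
from pathlib import PurePosixPath


def class_name_for(class_file, classes_root):
    """Turn a .class file path into the fully qualified class name.

    classes_root is the folder passed to -classpath; paths use forward slashes.
    """
    relative = PurePosixPath(class_file).relative_to(PurePosixPath(classes_root))
    return ".".join(relative.with_suffix("").parts)


print(class_name_for(
    "myfolder/target/classes/com/mytest/MyOnlyTest.class",
    "myfolder/target/classes",
))
```

If the file is not under the given root, relative_to raises a ValueError, which is the scripted version of picking the wrong classpath.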

Example (Using the command line):

c:\<myfolder>\target\classes>serialver -classpath "./" com.mytest.MyOnlyTest

If the class is a valid one, you will see the serialVersionUID entry like this:

com.mytest.MyOnlyTest: static final long serialVersionUID = -7039383885342506716L

Copy that value and paste it in your java file.
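If you need this for many classes, the output line is easy to pick apart in a script. A Python sketch, assuming the output format shown above (the names SERIALVER_LINE and parse_serialver are my own):

```python
import re

SERIALVER_LINE = re.compile(
    r"^(?P<cls>[\w.$]+):\s+.*serialVersionUID\s*=\s*(?P<uid>-?\d+)L"
)


def parse_serialver(line):
    """Return (class name, UID as int) from one line of serialver output."""
    match = SERIALVER_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized serialver output: " + line)
    return match.group("cls"), int(match.group("uid"))


cls, uid = parse_serialver(
    "com.mytest.MyOnlyTest: static final long serialVersionUID = -7039383885342506716L"
)
# Print a field declaration ready to paste into the class.
print("private static final long serialVersionUID = {}L;  // {}".format(uid, cls))
```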

Example (Using the UI):

c:\<myfolder>\target\classes>serialver -classpath "./" -show

This will display a small box where you can enter the full class name. Enter com.mytest.MyOnlyTest, then click "Show". If the class is a valid one, you will see the serialVersionUID entry in the "Serial Version" box.

Copy that value and paste it in your java file.

Note: If you get a message that <className> is not found, check whether you entered the right classpath value.