Tuesday, October 23, 2012

Configuring Hive metastore to remote database - WSO2 BAM2

Hive Metastore

The Hive metastore is the central repository used to store Hive metadata. WSO2 BAM uses an embedded H2 database as the default Hive metastore, so only one Hive session can access the metastore at a time.

Using a remote MySQL database as the Hive metastore

You can point the Hive metastore to a MySQL database as follows.

Edit hive-site.xml, located in the WSO2_BAM2_HOME/repository/conf/advanced/ directory, and set the following properties.

  <!-- The connection URL, user name and password below are example values; replace them with your own. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive_metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepassword</value>
    <description>password to use against metastore database</description>
  </property>

Put the MySQL JDBC driver JAR into WSO2_BAM2_HOME/repository/components/lib

Now restart the BAM server. You have successfully configured the Hive metastore to use a MySQL database.
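The metastore database itself must exist before BAM starts. Here is a minimal sketch of creating it in MySQL; the database name (hive_metastore) and credentials (hiveuser/hivepassword) are illustrative assumptions and must match whatever you configure in hive-site.xml.

```sql
-- Run in the MySQL console as an administrative user.
-- The database name, user and password are assumptions; use your own values.
CREATE DATABASE hive_metastore;
CREATE USER 'hiveuser'@'%' IDENTIFIED BY 'hivepassword';
GRANT ALL PRIVILEGES ON hive_metastore.* TO 'hiveuser'@'%';
FLUSH PRIVILEGES;
```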

Saturday, October 6, 2012

A Fix for Huawei E220 connection issue with ubuntu 12.04

After installing Ubuntu 12.04, I faced an issue when connecting to the internet with my Huawei E220 dongle. So I did a Google search and found a bug report related to this [1]. After going through the issue, I found a workaround which fixes it.

This is the workaround.

Execute the following command as root.

echo -e "AT+CNMI=2,1,0,2,0\r\nAT\r\n" > /dev/ttyUSB1 

Now try to connect your dongle again. It worked for me until the dongle was removed from the USB port. Thanks Nikos for your workaround :)

Saturday, September 8, 2012

WSO2 Business Activity Monitor 2.0.0 released ....!!!!

We spent almost a year on releasing WSO2 BAM 2.0.0, after completely re-writing it twice from BAM 1.x.x to BAM 2.0.0 according to the new architecture, suggestions and improvements. Finally we released it today; below you can see the release note for BAM 2.0.0 :)

WSO2 Business Activity Monitor 2.0.0 released!

The WSO2 Business Activity Monitor (WSO2 BAM) is an enterprise-ready, fully open source, complete solution for aggregating, analyzing and presenting information about business activities. Aggregation refers to the collection of data, analysis refers to the manipulation of data in order to extract information, and presentation refers to representing this data visually or in other ways such as alerts. The WSO2 BAM architecture reflects this natural flow in its design.
Since all WSO2 products are based on the component-based WSO2 Carbon platform, WSO2 BAM is lean, lightweight and consists of only the required components for efficient functioning. It does not contain unnecessary bulk, unlike many over-bloated, proprietary solutions. WSO2 BAM comprises only the modules required to give the best performance, scalability and customizability, allowing businesses to achieve time-effective results for their solutions without sacrificing performance or the ability to scale.
The product is available for download at: http://wso2.com/products/business-activity-monitor

  • Key Features

    Collect & Store any Type of Business Events

    • Events are named, versioned and typed by event source
    • Event structure consists of (name, value) tuples of business data, metadata and correlation data
  • High Performance Data Capture Framework

    • High performance, low latency API for receiving large volumes of business events over various transports including Apache Thrift, REST, HTTP and Web services
    • Scalable event storage into Apache Cassandra using column families per event type
    • Non-blocking, multi-threaded, low impact Java Agent SDK for publishing events from any Java based system
    • Use of Thrift, HTTP and Web services allows event publishing from any language or platform
    • Horizontally scalable, with load balancing and highly available deployment
  • Pre-Built Data Agents for all WSO2 Products

  • Scalable Data Analysis Powered by Apache Hadoop

    • SQL-like flexibility for writing analysis algorithms via Apache Hive
    • Extensibility via analysis algorithms implemented in Java
    • Schedulable analysis tasks
    • Results from analysis can be stored flexibly, including in Apache Cassandra, a relational database or a file system
  • Powerful Dashboards and Reports

    • Tools for creating customized dashboards with zero code
    • Ability to write arbitrary dashboards powered by Google Gadgets and JaggeryJS
  • Installable Toolboxes

    • Installable artifacts to cover complete use cases
    • One click install to deploy all artifacts for a use case

Issues Fixed in This Release

All fixed issues have been recorded at - http://bit.ly/Tzb1VP

Known Issues in This Release

All known issues have been recorded at - http://bit.ly/TzberZ

Engaging with Community

Mailing Lists

Join our mailing list and correspond with the developers directly.

Reporting Issues

WSO2 encourages you to report issues, enhancements and feature requests for WSO2 BAM. Use the issue tracker for reporting issues.

Discussion Forums

We encourage you to use stackoverflow (with the wso2 tag) to engage with developers as well as other users.


WSO2 Inc. offers a variety of professional Training Programs, including training on general Web services as well as WSO2 Business Activity Monitor and a number of other products. For additional support information please refer to http://wso2.com/training/


We are committed to ensuring that your enterprise middleware deployment is completely supported from evaluation to production. Our unique approach ensures that all support leverages our open development methodology and is provided by the very same engineers who build the technology.
For additional support information please refer to http://wso2.com/support/
For more information on WSO2 BAM, and other products from WSO2, visit the WSO2 website.

We welcome your feedback and would love to hear your thoughts on this release of WSO2 BAM.
The WSO2 BAM Development Team

Sunday, June 10, 2012

JDBC Storage Handler for Hive

I was able to complete the implementation of the Hive JDBC storage handler with basic functionality, so I thought I would write a blog post describing its usage with some sample queries. Currently it supports writing into any database and reading from the major databases (MySQL, MS SQL Server, Oracle, H2, PostgreSQL). This feature comes with the WSO2 BAM 2.0.0 release.

Setting up BAM to use the Hive JDBC handler

Add your JDBC driver to the $BAM_HOME/repository/components/lib directory before starting the server.

Web UI for executing Hive queries

BAM2 comes with a web UI for executing Hive queries. There is also an option to schedule a script.

User interface for writing Hive Queries

User interface for scheduling hive script

Sample on writing analyzed data into JDBC 

Here I am going to demonstrate writing analyzed data into JDBC storage. In this simple example, we'll fetch records from a file, analyze them using Hive, and finally store the analyzed data in a MySQL database.

Records - These are the records that we are going to analyze.

bread   12      12/01/2012
sugar   20      12/01/2012
milk    5       12/01/2012
tea     33      12/01/2012
soap    10      12/01/2012
tea     9       13/01/2012
bread   21      13/01/2012
sugar   9       13/01/2012
milk    14      13/01/2012
soap    8       13/01/2012
biscuit 10      14/01/2012

Hive Queries

//drop tables if they already exist
drop table productTable;
drop table summarizedTable;
//create the source table (reconstructed: the data file is tab separated,
//and the column names must match the summarizing query below)
CREATE TABLE productTable(product STRING, noOfItems INT, saleDate STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
//load the file with the above records
load data local inpath '/opt/sample/data/productInfo.txt' into table productTable;
//create the summarized table backed by the JDBC storage handler
CREATE EXTERNAL TABLE IF NOT EXISTS summarizedTable(product STRING, itemsSold INT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
                'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
                'mapred.jdbc.url' = 'jdbc:mysql://localhost/test',
                'mapred.jdbc.username' = 'username',
                'mapred.jdbc.password' = 'password',
                'hive.jdbc.primary.key.fields' = 'product',
                'hive.jdbc.update.on.duplicate' = 'true',
                'hive.jdbc.table.create.query' = 'CREATE TABLE productSummary (product VARCHAR(50) NOT NULL PRIMARY KEY, itemsSold INT NOT NULL)');
insert overwrite table summarizedTable SELECT product, sum(noOfItems) FROM productTable GROUP BY product;

View the result in MySQL.

mysql> select * from productSummary;
+---------+-----------+
| product | itemsSold |
+---------+-----------+
| biscuit |        10 |
| bread   |        33 |
| milk    |        19 |
| soap    |        18 |
| sugar   |        29 |
| tea     |        42 |
+---------+-----------+
6 rows in set (0.00 sec)

Detailed description of the TBLPROPERTIES used by the storage handler.

  • mapred.jdbc.driver.class (required) - The class name of the JDBC driver to use. This should be available on Hive's classpath.
  • mapred.jdbc.url (required) - The connection URL for the database.
  • mapred.jdbc.username (optional) - The database username, if it's required.
  • mapred.jdbc.password (optional) - The database password, if it's required.
  • hive.jdbc.table.create.query (optional) - If the table already exists in the database, you don't need this. Otherwise you should provide the SQL query for creating the table in the database.
  • mapred.jdbc.output.table.name (optional) - The name of the table in the database. It does not have to be the same as the name of the table in Hive. If you have specified the SQL query for creating the table, the handler will pick the table name from that query. Otherwise you need to specify this if your meta table name is different from the table name in the database.
  • hive.jdbc.primary.key.fields (required if the database table has primary keys) - The primary key fields of the database table.
  • hive.jdbc.update.on.duplicate (optional) - Expected values are either "true" or "false". If "true", the storage handler will update the records with duplicate keys; otherwise it will insert all data. This can be used to optimize the update operation: the default implementation issues an insert or update statement after a select statement, so there are two database round trips, but this can be reduced to one by using a DB-specific upsert statement. An example query for a MySQL database is 'INSERT INTO productSummary (product, itemsSold) VALUES (?,?) ON DUPLICATE KEY UPDATE itemsSold=?'.
  • hive.jdbc.upsert.query.values.order (mandatory if you are using an upsert query) - The values order for each question mark in the upsert query. Sample values for the above query would be 'product,itemsSold,itemsSold'.
  • hive.jdbc.input.columns.mapping (optional) - This is mandatory if the field names in your meta table and database table are different. Provide the field names in the database table, separated by ',', in the same order as the field names in the meta table. Example: productNames,noOfItemsSold will map to the meta table field names product,itemsSold.
  • mapred.jdbc.input.table.name (optional) - Used when reading from a database table. This is needed if the meta table name and database table name are different.
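Putting the upsert-related properties together, a table definition using the optimization might look like the sketch below. Note that the post names only the values-order property; the property name 'hive.jdbc.upsert.query' for supplying the upsert statement itself is my assumption here, so verify it against the handler source if it does not take effect.

```sql
-- Sketch only: 'hive.jdbc.upsert.query' is an assumed property name.
-- Table and column names reuse the productSummary sample above.
CREATE EXTERNAL TABLE IF NOT EXISTS summarizedTable(product STRING, itemsSold INT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
    'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
    'mapred.jdbc.url' = 'jdbc:mysql://localhost/test',
    'mapred.jdbc.username' = 'username',
    'mapred.jdbc.password' = 'password',
    'hive.jdbc.primary.key.fields' = 'product',
    'hive.jdbc.update.on.duplicate' = 'true',
    'hive.jdbc.upsert.query' = 'INSERT INTO productSummary (product, itemsSold) VALUES (?,?) ON DUPLICATE KEY UPDATE itemsSold=?',
    'hive.jdbc.upsert.query.values.order' = 'product,itemsSold,itemsSold');
```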

Sample on reading from JDBC.

Now I am going to read the previously saved records from MySQL using the Hive JDBC handler.

Hive queries

//drop the table if it already exists
drop table savedRecords;
//create the meta table backed by the JDBC storage handler
//(the column definition is reconstructed to match the productSummary table)
CREATE EXTERNAL TABLE IF NOT EXISTS savedRecords(product STRING, itemsSold INT)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
             TBLPROPERTIES (
                    'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
                    'mapred.jdbc.url' = 'jdbc:mysql://localhost/test',
                    'mapred.jdbc.username' = 'username',
                    'mapred.jdbc.password' = 'password',
                    'mapred.jdbc.input.table.name' = 'productSummary');
SELECT product,itemsSold FROM savedRecords ORDER BY itemsSold;

This will give all the records in the productSummary table.
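Once defined, savedRecords behaves like any other Hive table, so you can run further queries over it. A small sketch, assuming the table definition above:

```sql
-- The rows are fetched from MySQL via the handler; the filter is applied in Hive.
SELECT product, itemsSold FROM savedRecords WHERE itemsSold > 20;
```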

Sunday, April 29, 2012

How to remote debug Apache Cassandra standalone server

In order to debug the Cassandra server from your favorite IDE, you need to add the following to cassandra-env.sh, located in the apache-cassandra-1.1.0/conf directory.

JVM_OPTS="$JVM_OPTS -Xnoagent"
JVM_OPTS="$JVM_OPTS -Djava.compiler=NONE"
JVM_OPTS="$JVM_OPTS -Xrunjdwp:transport=dt_socket,server=y,address=5005,suspend=n"

After adding this, once you start the server you will see the following line printed on the Cassandra console:

"Listening for transport dt_socket at address: 5005" 

This is the port that you specified in JVM_OPTS. You can change it to another value if you want.

Now configure your IDE to run on debug mode.

Now you can debug the apache cassandra server from your favorite IDE :)

Sunday, March 18, 2012

WSO2 BAM 2.0.0-Alpha 2 released..!!!!

After working hard on BAM 2.0.0 Alpha 2, we were able to release it on 13th March 2012.

This is the release note :

WSO2 team is pleased to announce the release of version 2.0.0 - ALPHA 2 of WSO2 Business Activity Monitor.
WSO2 Business Activity Monitor (WSO2 BAM) is a comprehensive framework designed to solve problems in the wide area of business activity monitoring. WSO2 BAM comprises many modules to give the best performance, scalability and customizability. These allow the requirements of business users, DevOps and CxOs to be met without spending countless months customizing the solution, and without sacrificing performance or the ability to scale.
WSO2 BAM is powered by WSO2 Carbon, the SOA middleware component platform.

The binary distribution can be downloaded at http://dist.wso2.org/products/bam/2.0.0-alpha2/wso2bam-2.0.0-ALPHA2.zip

This release comes with the following samples:

  1. Service Data Agent - Sample to install the Service data agent, publish statistics and intercepted message activity from service-hosting WSO2 servers such as WSO2 AS, DSS, BPS, CEP, BRS and any other WSO2 Carbon server with the service hosting feature
  2. Mediation Data Agent - Sample to install Mediation data agent, publish mediation statistics and intercepted message activity using Message Activity Mediators from the WSO2 ESB
  3. Data center wide cluster monitoring - Sample to simulate two data centers each having two clusters sending statistics events, perform summarizations and visualize them in a dashboard
  4. End - End Message Tracing - Sample to simulate messages fired from a set of servers to WSO2 BAM and set up message tracing analytics and visualizations of respective messages
  5. KPI Definition - Sample to simulate receiving events from a server (ex: WSO2 AS), perform summarizations and visualize product and consumer data in a retail store
  6. Fault Detection & Alerting - Sample to simulate receiving events from a server (ex: WSO2 ESB), detect faults and fire email alerts


  • Data Agents
    1. Pre built data agents - Service Data Agent for the WSO2 AS, DSS, BPS, CEP, BRS and any other WSO2 Carbon server with the service hosting feature and Mediation Data Agent for the WSO2 ESB
    2. A re-usable Agent API to publish events to the BAM server from any application (samples included)
    3. Apache Thrift based Agents to publish data at extremely high throughput rates
    4. Option to use Binary or HTTP protocols
  • Event Storage
    1. Apache Cassandra based scalable data architecture for high throughput of writes and reads
    2. Carbon based security mechanism on top of Cassandra
  • Analytics
    1. An Analyzer Framework with the capability of writing and plugging in any custom analysis tasks
    2. Built-in analyzers for common operations such as get, put, aggregate, alert, fault detection, etc.
    3. Scheduling capability of analysis tasks
  • Visualization
    1. Drag and drop gadget IDE to visualize analyzed data with zero code
    2. Capability to plug in additional UI elements and Data sources to Gadget IDE
    3. Google gadgets based dashboard

Reporting Issues

WSO2 encourages you to report issues, enhancements and feature requests for WSO2 BAM. Use the issue tracker for reporting any of these.

Sunday, February 26, 2012

Setting up a Cassandra cluster using wso2 carbon

If you want to use the WSO2 security model with a Cassandra cluster, here I'll show you how you can set up a Cassandra cluster using WSO2 Carbon.

First you need to download wso2 carbon (I am using version 3.2.2)

Then install the Cassandra feature to the WSO2 Carbon server using the p2 repository at http://dist.wso2.org/p2/carbon/releases/3.2.2/.

This will install Cassandra version 0.7 to your Carbon server.

Adding p2 repository (http://dist.wso2.org/p2/carbon/releases/3.2.2/)

Installing Cassandra 3.2.2 feature

After finishing the installation, restart the Carbon server. Now the Carbon server will work as your Cassandra server.

Set up a few more Cassandra nodes using WSO2 Carbon as above, according to your requirements.
You can follow the instructions given on this site for setting up the Cassandra cluster.
The cassandra.yaml configuration file is located in the $wso2carbon_home/repository/conf/advanced/ directory.

Add the following configuration file (cassandra-auth.xml) to $wso2carbon_home/repository/conf/advanced/ in order to view keyspaces using the Cassandra Keyspaces UI (change the username and password accordingly).



Cassandra Keyspaces ui

Once you finish the configuration, you can check the status of the cluster using the Cassandra cluster UI, or you can use the nodetool that comes with Apache Cassandra to monitor the cluster.

Cassandra cluster monitor ui 


$./nodetool -h -p 9999 ring -u admin -pw admin

Address         Status State   Load            Owns    Token
                                                       113427455640312821154458202477256070485
                Up     Normal  20.36 MB        33.33%  0
                Up     Normal  251.64 MB       33.33%  56713727820156410577229101238628035242
                Up     Normal  20.95 MB        33.33%  113427455640312821154458202477256070485

Note: the remote JMX agent port number in a Carbon server is 9999 + offset (the default offset in carbon.xml is 0).

Thursday, February 16, 2012

Fixing ADB databinding issue when web service method returning OMElement

When I tried to call a web service which returns an OMElement, I faced the above issue (my Axis2 version is 1.6.1). Below are the steps I took to fix it.

This is the part of the stack trace.

org.apache.axis2.AxisFault: org.apache.axis2.databinding.ADBException: Any type  element type has not been given
    at org.apache.axis2.AxisFault.makeFault(AxisFault.java:430)
    at org.wso2.carbon.bam.presentation.stub.QueryServiceStub.fromOM(QueryServiceStub.java:8908)
    at org.wso2.carbon.bam.presentation.stub.QueryServiceStub.queryColumnFamily(QueryServiceStub.java:800)
    at org.wso2.carbon.bam.clustermonitor.ui.ClusterAdminClient.getClusterStatistics(ClusterAdminClient.java:148)

If you check the schema of the response element in your WSDL (generated by Axis2), it should be similar to this.

<xs:element name="queryColumnFamilyResponse">
    <xs:complexType>
        <xs:sequence>
            <xs:element minOccurs="0" name="return" nillable="true" type="xs:anyType" />
        </xs:sequence>
    </xs:complexType>
</xs:element>

In order to fix the ADB databinding issue you need to change the above schema as follows and regenerate the stub code.

<xs:element name="queryColumnFamilyResponse">
    <xs:complexType>
        <xs:sequence>
            <xs:any processContents="skip"/>
        </xs:sequence>
    </xs:complexType>
</xs:element>

Then ADB will generate code that represents the content of the original message as an OMElement, and this will fix your problem.