Monday 23 July 2018

Hadoop admin

Hi all.

Hadoop is a solution to big data.

1. Big data? What does "big data" mean?

Data that cannot be handled efficiently by existing systems is known as big data.

In the old days, Unix admins and SQL admins were in the market with high earnings. Now it is the time of the Hadoop admin. Start your career as a Hadoop admin to be in a good position in the market.


Follow this blog for new posts on Hadoop administration.

I hope everyone will enjoy learning Hadoop administration and earn more in this competitive world.

Here I will discuss:

1. SQL drawbacks
2. The importance of Hadoop
3. How to get into Hadoop
4. The main concepts of Hadoop



Thursday 10 November 2016

DATA HUB EXTENSION

                                                    DATAHUB
1. What is Data Hub?
2. Why do we use Data Hub?
3. How does it work?
4. How to install Data Hub
5. How to work with Data Hub


As we all know, if we want to feed any data into hybris, we use ImpEx. IMPEX INSERT statements are one way to feed data into the hybris system.
I have one question here: if we use ImpEx to feed data into hybris, then why would we use DATAHUB?


The answer is simple. For example, say we get a new project that is already running on some other technology, and we want to rebuild that project from scratch. Just think how many lines of ImpEx code would be required to feed all of that data into our hybris system.
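For a sense of scale, a single product row in ImpEx looks roughly like this (a hypothetical sketch; the catalog name, version, and product codes are made up for illustration):

```impex
# Hypothetical ImpEx sketch: inserting products into a catalog version.
# Every master-data record migrated by hand needs a row like these.
INSERT_UPDATE Product; code[unique=true]; name[lang=en]; catalogVersion(catalog(id), version)
; 10001 ; Blue T-Shirt ; myCatalog:Staged
; 10002 ; Red T-Shirt  ; myCatalog:Staged
```

Now imagine thousands of such rows for products, prices, stock, and customers; that is the manual effort Data Hub is meant to avoid.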



Someone might say: just go for SAP PI or other tools. Yes, we can use SAP PI. SAP NetWeaver Process Integration (SAP PI) is SAP's enterprise application integration (EAI) software, a component of the NetWeaver product group used to facilitate the exchange of information among a company's internal software and systems and those of external parties.
One-to-one and one-to-many integrations are easy, but if many-to-many or many-to-one scenarios come up, we have to take care of some mappings, and we would also need to become familiar with SAP ABAP.
 


If we use DATAHUB, it is just a simple extension, and we do not need to take care of those things.
DATAHUB runs on a Tomcat server, and the coding is done in Spring, so I think it reduces our work. That is why I prefer to use DATAHUB.
                                               1. What is Data Hub?



Data Hub is one of the extensions in hybris. It runs on a Tomcat server. The main aim of Data Hub is to transfer huge amounts of data from an external DB to hybris and from hybris to an external DB. For example, take an SAP system such as SAP HANA: as we all know, SAP exchanges data in IDoc form, which can be represented in XML format, so it is human-readable and very easy to work with. DATAHUB takes this format as input, and finally, after some steps, the data can be inserted into the hybris system. We use DATAHUB in our project to do exactly this.
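To give a feel for the input, an IDoc rendered as XML is a nested structure of segments and fields, roughly like this (a simplified, hypothetical fragment; real IDoc types such as MATMAS for material master data have many more segments and fields):

```xml
<!-- Hypothetical, simplified sketch of IDoc-style XML for illustration only -->
<IDOC>
  <SEGMENT>
    <MATNR>10001</MATNR>        <!-- material number -->
    <DESCR>Blue T-Shirt</DESCR> <!-- description -->
  </SEGMENT>
</IDOC>
```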



                              2. Why do we use Data Hub?



As I said above, it makes our task very easy. It is easy to understand, and it reduces project development time. Also, it is an SAP product, so it belongs to the SAP family.
The Data Hub also acts as a staging area where external data can be analyzed for errors and corrected before being fed into hybris.

                               3. How does it work?



DATAHUB takes XML as input, and finally we want to convert it into ImpEx format.
This can happen in two directions, described below.

Outbound: hybris -> SAP ERP
The Process Engine is used for decoupling. The Data Hub adapter transfers data to the Data Hub. Data Hub extensions define the raw data formats, the transformation to the canonical data format, and the target system format. The SAP IDoc outbound adapter reads the target system definition, creates IDocs, and sends them via Spring Integration to SAP ERP.

Inbound: SAP ERP -> hybris
IDocs are received by the SAP IDoc inbound adapter, which creates Spring Integration messages. Messages are routed to mapping services provided by the SAP extensions. The mapping services create raw data fragments and route them to the Data Hub raw inbound. The Data Hub transforms raw items to canonical items and then to target items, which are published to hybris. During ImpEx data processing, hybris services, interceptors, translators, and events are used.

                                               4. How to install Data Hub:



STEP 1:-


Configure Tomcat for use with the Data Hub. This involves setting up the JVM parameters. The easiest way to do this is to create a Tomcat startup script that sets the CATALINA_OPTS parameters. Typical parameters for a UNIX system are "-Xms2048m -Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+AlwaysPreTouch -XX:+DisableExplicitGC". Two examples of Tomcat startup scripts follow below.


1.1 Installing a Standalone Data Hub Server:-
The Data Hub is not hard to install, but installation is complex. If you make a small mistake, it just won't work. So go slow, take your time, read the instructions carefully, and follow them exactly.

1.2 Downloading and Installing the Data Hub Software:-
The Data Hub is a module of the hybris Commerce Suite. To get access to the Data Hub software, go to the Download page and download the full installation of the hybris Commerce Suite, then extract the zip archives. The Commerce Suite at the hybris wiki Download page is not intended for production use. If you need a Data Hub download for production use, go to the Download Center of SAP Service Marketplace.

1.3 Installation of the Data Hub on a Tomcat Server:-
With your browser, connect to https://tomcat.apache.org/download-70.cgi and download the latest version of Tomcat 7.x. Tomcat 7.x is the only version supported by the Data Hub. You should create a Tomcat server start-up file with the suggested performance configuration, for example datahubserver.sh or datahubserver.bat, which should be used to start a Tomcat server for use with the Data Hub. To apply the VM options of your choice to the Data Hub server environment, you need to set the CATALINA_OPTS system variable, either in the user profile (UNIX) or in the System Environment Variables (Windows).
Here are some examples of such a script: datahubserver.sh (for UNIX)
#!/bin/bash
# resolve links - $0 may be a softlink
PRG="$0"
while [ -h "$PRG" ]; do
  ls=`ls -ld "$PRG"`
  link=`expr "$ls" : '.*-> \(.*\)$'`
  if expr "$link" : '/.*' > /dev/null; then
    PRG="$link"
  else
    PRG=`dirname "$PRG"`/"$link"
  fi
done
# Get standard environment
PRGDIR=`dirname "$PRG"`
# Explanation of settings:
# * Set the minimum memory to 2gb
# * Set the maximum memory to 4gb
# * Use the ParNew garbage collector for the young generation heap
# * Use the ConcurrentMarkSweep garbage collector for the old generation heap
# * Tell the JVM to touch all memory pages during JVM initialization
# * Disable explicit garbage collection (i.e., via the System.gc() method)
export CATALINA_OPTS="-Xms2048m -Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+AlwaysPreTouch -XX:+DisableExplicitGC"
# Run Tomcat in the foreground
"$PRGDIR"/catalina.sh run
or datahubserver.bat (for Windows)
rem @echo off
rem Explanation of settings in CATALINA_OPTS:
rem * Set the minimum memory to 2gb
rem * Set the maximum memory to 4gb
rem * Use the ParNew garbage collector for the young generation heap
rem * Use the ConcurrentMarkSweep garbage collector for the old generation
rem     heap
rem * Tell the JVM to touch all memory pages during JVM initialization
rem * Disable explicit garbage collection (i.e., via the System.gc() method)
set CATALINA_OPTS=-Xms2048m -Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+AlwaysPreTouch -XX:+DisableExplicitGC
setlocal
if not ""%1"" == ""run"" goto mainEntry
if "%TEMP%" == "" goto mainEntry
if exist "%TEMP%\%~nx0.run" goto mainEntry
echo Y>"%TEMP%\%~nx0.run"
if not exist "%TEMP%\%~nx0.run" goto mainEntry
echo Y>"%TEMP%\%~nx0.Y"
call "%~f0" %* <"%TEMP%\%~nx0.Y"
rem Use provided errorlevel
set RETVAL=%ERRORLEVEL%
del /Q "%TEMP%\%~nx0.Y" >NUL 2>&1
exit /B %RETVAL%
:mainEntry
del /Q "%TEMP%\%~nx0.run" >NUL 2>&1
rem Guess CATALINA_HOME if not defined
set "CURRENT_DIR=%cd%"
echo "CATALINA_HOME=%CATALINA_HOME%"
if not "%CATALINA_HOME%" == "" goto eof
set "CATALINA_HOME=%CURRENT_DIR%"
if exist "%CATALINA_HOME%\bin\catalina.bat" call "%CATALINA_HOME%\bin\catalina.bat" run
:eof
The VM options shown above are just an example and should not be used "as is" in a production environment. You need to determine your own best configuration.
 
1.5 Deploying Data Hub on Tomcat:-
Data Hub can be deployed to Tomcat like any other WAR file. One option is to simply rename the Data Hub WAR file to datahub-webapp.war and copy it to the Tomcat webapps directory. The next time Tomcat starts, it explodes the WAR file automatically. Alternatively, you can use a context.xml file to customize the location and context path of the Data Hub WAR file. Using a custom context.xml is optional, but it is the preferred method, because it keeps your Tomcat webapps directory unencumbered. You can find a full explanation of how to use Tomcat's context.xml file at https://tomcat.apache.org/tomcat-7.0-doc/config/context.html.
If, for whatever reason, you want to delete the WAR file, you have to stop Tomcat first.
If you are deploying a new instance of datahub-webapp.war, you must manually delete the old exploded version contained in $CATALINA_BASE/webapps.
1.6 Configuring the Location of the local.properties File:-
The local.properties file is used by the Data Hub to store configuration properties, and it must either exist in the classpath of Tomcat or be pointed to in the Tomcat context.xml file.
1.7 Deploying Data Hub Extensions on Tomcat
Data Hub extension JARs must be deployed so they are available to the web application class loader. One option is to copy extension JARs into the \WEB-INF\lib directory of the exploded Data Hub WAR file. Alternatively, you can use the context.xml file to add additional directories to be scanned for resources by the class loader. This allows you to keep your Data Hub extensions in a directory outside of the exploded Data Hub WAR.
Sample context.xml
In the example below, you create a datahub-webapp.xml and copy it to the $CATALINA_BASE/conf/<enginename>/<hostname> directory. Naming the file datahub-webapp.xml sets the context path to /datahub-webapp. You configure two directories in the virtualClasspath attribute:
·         config-dir is the directory where the local.properties file is placed
·         datahub-extensions is the directory where Data Hub extensions are placed


datahub-webapp.xml
<Context antiJARLocking="true"
docBase="/path/to/datahub/war/datahub-webapp-5.x.x.x-RCx.war"
reloadable="true">
<Loader className="org.apache.catalina.loader.VirtualWebappLoader"
virtualClasspath=
"/path/to/datahub/config-dir/;
/path/to/datahub/extensions-dir/*.jar" />
</Context>
There is one additional update needed regarding the uploaded datahub-webapp web application. You need to install the following extension jar files, while being careful to select versions compatible with your Data Hub version:
·         datahub-cleanup-5.7.0.2-RC2.jar (used to automatically clean up the extraneous records in the Data Hub database - 5.7 only - refer to Automated Elimination of Data Hub Auditing Related Database Records - 5.7)
·         csv-web-service-X.jar (for versions prior to 5.6)
·         pcm-apparel-raw-X.jar
·         pcm-apparel-canonical-X.jar
·         pcm-apparel-target-X.jar
These files are available from the Data Hub Solution Book. You need to restart the Tomcat server after installing any extension files.
You are going to configure other options that affect Data Hub. Tomcat needs to be restarted for each of these, so it is best to wait until all installation/configuration steps are complete before starting Tomcat to launch Data Hub.
 
 
STEP2:-
 
Copy the Data Hub webapp WAR file from %HYBRIS_HOME%/hybris/bin/ext-integration/datahub to the Tomcat webapps directory and start Tomcat. Then stop Tomcat.
 
 
 
    STEP 3:-


  

Select the relational database you are going to use with Data Hub and configure Data Hub for it.
By default, when the datahub-webapp.war file is deployed, the application is configured to use the HSQL database. However, HSQL is not a supported database for production deployments.
By default, the Data Hub relies on a database instance with the name of 'integration' with an administrative user named 'hybris' with the password 'hybris'. The database needs to be created by the administrative user ('root') logging into the system. The 'hybris' user needs DBA rights and full schema privileges to the 'integration' database instance. You can use the following SQL statements to recreate this database between Data Hub runs to refresh the Data Hub database.
drop database integration;
create database integration;
By default, the kernel.autoInitMode is set to create-drop. If you are in a production environment, hybris recommends the kernel.autoInitMode property be set to update instead.
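Putting those defaults together, an initial MySQL setup might look like the following (a sketch assuming MySQL and the default 'integration'/'hybris' names mentioned above; run it as the root user and change the password for a real deployment):

```sql
-- Sketch: create the default Data Hub database and user.
-- Names and password are the documented defaults, not production values.
CREATE DATABASE integration DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;
CREATE USER 'hybris'@'localhost' IDENTIFIED BY 'hybris';
GRANT ALL PRIVILEGES ON integration.* TO 'hybris'@'localhost';
FLUSH PRIVILEGES;
```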
You need to configure your local.properties file for the appropriate database. To find the correct parameters for your database, see the documentation for that database.


You need to add the MySQL database driver to the Tomcat server's classpath. For example, mysql-connector-java-5.1.x-bin.jar would be placed in %TOMCAT_HOME%\lib directory.
After the Data Hub application is deployed in the web server, there is a module.properties file in the apache-tomcat-<version>/webapps/datahub-webapp/WEB-INF/classes directory. The values in that file can be superseded by a local.properties file, which you create to reflect your local Data Hub setup. The content of the local.properties file should reflect your database choice, as shown in the examples for MySQL below.

local.properties Version 5.3 and 5.4
dataSource.driverClass=com.mysql.jdbc.Driver
dataSource.jdbcUrl=jdbc:mysql://localhost/integration?useConfigs=maxPerformance
dataSource.username=...
dataSource.password=...
#media storage
mediaSource.driverClass=com.mysql.jdbc.Driver
mediaSource.jdbcUrl=jdbc:mysql://localhost/integration?useConfigs=maxPerformance
mediaSource.username=...
mediaSource.password=...
local.properties Version 5.5 through 5.6
dataSource.className=com.mysql.jdbc.jdbc2.optional.MysqlDataSource
dataSource.jdbcUrl=jdbc:mysql://localhost/integration?useConfigs=maxPerformance
dataSource.username=...
dataSource.password=...

#media storage
media.dataSource.className=com.mysql.jdbc.jdbc2.optional.MysqlDataSource
media.dataSource.jdbcUrl=jdbc:mysql://localhost/integration?useConfigs=maxPerformance
media.dataSource.username=...
media.dataSource.password=...
local.properties Version 5.7 and higher
dataSource.className=com.mysql.jdbc.jdbc2.optional.MysqlDataSource
dataSource.jdbcUrl=jdbc:mysql://localhost/integration?useConfigs=maxPerformance&rewriteBatchedStatements=true
dataSource.username=...
dataSource.password=...

You need to restart the Tomcat server for the changes to take effect.

By default, MySQL performs case-insensitive queries, which may be an issue in some cases. Case sensitivity is set using the collate parameter. To enable case-sensitive queries in the Data Hub, the schema needs to be created as shown below. Also note that MySQL adopts the operating system's case settings in some cases, such as on Linux.

Collate Parameter used in a Schema Command
CREATE SCHEMA `integration`
DEFAULT CHARACTER SET utf8
COLLATE utf8_bin;

You can use the following SQL statements to recreate this database between Data Hub runs to refresh the Data Hub database.

drop database integration;
create database integration;

Step 4:

By going through the following steps, you can configure the Data Hub to use encryption for secure attributes:

1.     Generate an encryption key
2.     Store the key in a file on the classpath or file system and set the datahub.encryption.key.path property in the local.properties file

5.1 Generate an Encryption Key:-
The Data Hub uses AES/ECB/PKCS5Padding symmetric key encryption with a 128-bit key by default. There are many different ways you can generate an AES key. If you have OpenSSL installed on your machine, you can generate a 128-bit AES key using the following terminal command:

openssl enc -aes-128-cbc -k secret -P -md sha1

You see the following sample result:

salt=A941C164EBA24ADC

key=6F94977631C9CE1BADA1EA7B8AC609B4

iv =9FC846FA06ABD024181D3A2B2977B5FE

You are only interested in the value for key: 6F94977631C9CE1BADA1EA7B8AC609B4

Store the Key in a File and Set the datahub.encryption.key.path Property

Once you have generated a secret key, save it to a file on the classpath or file system. You must specify the path to the file that contains the key by setting the datahub.encryption.key.path property in the $TOMCAT_HOME/lib/local.properties file. A relative path is enough if the file is on the classpath. An absolute path is required if the file is on the file system but not on the classpath. For example, if the file is on the classpath:


datahub.encryption.key.path=path/to/encryption-key.txt

or, if the file is not on the classpath:

datahub.encryption.key.path=/Users/userhome/path/to/encryption-key.txt
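If OpenSSL is not available, the two steps above (generate a 128-bit key, store it in a file) can also be sketched with plain POSIX tools; the file name encryption-key.txt is just an example, and the resulting file is what datahub.encryption.key.path should point to:

```shell
# Generate 16 random bytes (128 bits), hex-encode them with od/tr,
# and store the result in a key file for datahub.encryption.key.path.
key=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
echo "$key" > encryption-key.txt
echo "key=$key"
```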

5.2 Secured Attribute Value Masking:-
Secured attribute values are masked when returned in REST API responses and in logging. By default, secured attribute values are replaced with "*******".

Secured attribute value masking can be configured with the following properties:

# enable/disable secured attribute value masking
datahub.secure.data.masking.mode=true
# set the masking value
datahub.secure.data.masking.value=*******


STEP 6:-Data Hub Adapter Extension


The Data Hub Adapter is a Core Platform extension that links the hybris platform to the Data Hub.

The Data Hub Adapter extension is found in the %HYBRIS_HOME%/bin/ext-integration directory. You should specify it as one of the extensions in the %HYBRIS_HOME%/config/localextensions.xml file:
<extensions>
<!-- .... -->
<extension dir="${HYBRIS_BIN_DIR}/ext-integration/datahubadapter" />
</extensions>
You do not need to run a platform update for this change. You do need to run ant clean all from the command line to update your hybris Commerce Suite.
Then you need to update the $TOMCAT_HOME/lib/local.properties file. The local.properties file should contain the following statements.
datahub.extension.exportURL=http://localhost:9001/datahubadapter
datahub.extension.userName=admin
datahub.extension.password=nimda
Once the local.properties file is in place, you can start your hybris Commerce Suite server.
More detailed information regarding this step can be found in the Data Hub Adapter Extension.
Step 7:-
Start Tomcat.
Using a REST browser tool such as Postman, issue the following command:
GET - http://localhost:8080/datahub-webapp/v1/data-feeds/DEFAULT_FEED/
You should get a 200 Success message in response, which shows the Data Hub is running properly.
 
 

For more information, just go to the hybris wiki.