Limit Search To Specific Directory Level Using mindepth and maxdepth

Blog category:

linux

Find the passwd file under all sub-directories starting from the root directory.
# find / -name passwd
./usr/share/doc/nss_ldap-253/pam.d/passwd
./usr/bin/passwd
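As a minimal sketch of the depth limits named in the title, the search can be tried on a throwaway directory tree instead of the real root. The tree layout and the helper name `find_passwd_between` are assumptions made for illustration; `-mindepth` and `-maxdepth` are the standard find options.

```shell
# Build a small throwaway tree so the search is safe to run anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/etc" "$demo/usr/bin" "$demo/usr/share/doc/pam.d"
touch "$demo/etc/passwd" "$demo/usr/bin/passwd" "$demo/usr/share/doc/pam.d/passwd"

# Depth is counted from the starting point: $demo itself is depth 0,
# $demo/etc/passwd is depth 2, $demo/usr/bin/passwd is depth 3.
find_passwd_between() {
    # $1 = minimum depth, $2 = maximum depth (hypothetical helper)
    find "$demo" -mindepth "$1" -maxdepth "$2" -name passwd | sort
}

find_passwd_between 0 2   # only etc/passwd
find_passwd_between 3 5   # usr/bin/passwd and usr/share/doc/pam.d/passwd

# Remove the tree when finished: rm -rf "$demo"
```

Placing the depth options before `-name` keeps GNU find from warning about option order.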

2012-07-23 09:58
Views: 855
Comments (0)
Category: Operating Systems

Introduction to HDFS Architecture

Blog category:

hadoop

Reposted from: http://asyty-cp.blog.163.com/blog/static/117542439201191322858356/
I. Overview of the HDFS framework. Figure 1: HDFS framework diagram

2012-07-17 15:02
Views: 1615
Comments (0)
Category: Open Source Software

Linux rpm Command Options Explained [Introduction and Usage]

Blog category:

linux

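RPM is the RedHat Package Manager, similar to "Add/Remove Programs" on Windows. rpm packages come in two kinds: binary packages and source packages. A binary package can be installed directly on the machine, while a source package is compiled and installed automatically by RPM. Source packages usually carry the suffix src.rpm.
Common option combinations:
-ivh: install and show installation progress (--install --verbose --hash);
-Uvh: upgrade a package (--upgrade);
-qpl: list the files inside an RPM package file [Query Package list];
-qpi: show the description of an RPM package file [Query Pack ...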
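The query options listed above could be combined into a small helper along these lines. This is only a sketch: the function name `rpm_summary` and the package file names are hypothetical, and the helper simply reports an error if rpm or the file is not available.

```shell
# Print the description and file list of an RPM file without installing it.
# "$1" is the path to a package file; any file name used here is hypothetical.
rpm_summary() {
    pkg=$1
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm is not installed on this system" >&2
        return 1
    fi
    if [ ! -f "$pkg" ]; then
        echo "no such package file: $pkg" >&2
        return 1
    fi
    rpm -qpi "$pkg"   # query package information
    rpm -qpl "$pkg"   # query package file list
}

# Installing or upgrading normally needs root, so these stay commented out:
# rpm -ivh example-1.0-1.x86_64.rpm   # install, verbose, hash-mark progress
# rpm -Uvh example-1.0-1.x86_64.rpm   # upgrade (installs if not yet present)
```

Querying with `-qp` reads the package file itself, so it works before anything is installed.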

2012-07-16 13:33
Views: 648
Comments (0)
Category: Operating Systems

Checking the Linux Version

Blog category:

linux

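How do you find out which version of Linux you are running? The following methods will give you the answer.
1. Commands for checking the kernel version:
1) [root@q1test01 ~]# cat /proc/version
Linux version 2.6.9-22.ELsmp (bhcompile@crowe.devel.redhat.com) (gcc version 3.4.4 ...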
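The kernel-version check can be wrapped in a short shell sketch. Note that `uname -r` and the `uname -a` fallback are additions that do not appear in the excerpt, and the function names are hypothetical.

```shell
# Kernel release string only, e.g. the "2.6.9-22.ELsmp" part of /proc/version.
kernel_release() {
    uname -r
}

# Full kernel banner. /proc/version exists only on Linux, so fall back to
# uname -a on systems that do not provide it.
kernel_banner() {
    if [ -r /proc/version ]; then
        cat /proc/version
    else
        uname -a
    fi
}

kernel_release
kernel_banner
```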

2012-07-16 13:27
Views: 692
Comments (0)
Category: Operating Systems

Hadoop Dont's: What not to do to harvest Hadoop's full potential

Blog category:

hadoop

We've all heard this story. All was fine until one day your boss heard somewhere that Hadoop and No-SQL are the new black and mandated that the whole company switch over whatever it was doing to the Hadoop et al. technology stack, because that's the only way to get your solution to scale to web pr ...

2012-07-15 15:45
Views: 694
Comments (0)
Category: Open Source Software

Successfully Installing and Enabling LZO Compression on hadoop-0.20.203

Blog category:

hadoop

# Prepare the installation packages and scp them to every node
pwd
/work/lzo
#scp ./* node-host:/work/lzo
ls -l
total 3240
-rw-r--r-- 1 root root 2176215 07-13 16:12 hadoop-gpl-packaging-0.2.8-1.x86_64.rpm
drwxr-xr-x 13 root root 4096 07-13 16:23 lzo-2.06
-rw-r--r-- 1 root root 141887 07-13 16:12 lzo-2.06-1.el5.rf.x86_64.rpm
-rw-r--r ...

2012-07-13 18:10
Views: 2604
Comments (0)
Category: Open Source Software

A Brief Analysis of the Hadoop TaskScheduler

Blog category:

hadoop

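Reposted from: http://hi.baidu.com/_kouu/blog/item/f51e57dc73d42d2a5982dd8a.html
TaskScheduler, as the name suggests, is the task scheduler in MapReduce. In MapReduce, the JobTracker receives the Jobs submitted by the JobClient and, following the splits defined by the InputFormat and other related configuration, generates a number of Map and Reduce tasks. Then, when a TaskTracker reports through its heartbeat that it still has free task slots, the JobTracker assigns tasks to it. Exactly which tasks should be assigned to that TaskTracker is what the TaskSc ...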

2012-07-13 14:01
Views: 895
Comments (0)
Category: Open Source Software

A Brief Analysis of the Hadoop OutputFormat

Blog category:

hadoop

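Reposted from: http://hi.baidu.com/_kouu/blog/item/dd2f08fd25da09e0fc037f15.html
In Hadoop, OutputFormat and InputFormat are counterparts. Compared with InputFormat, OutputFormat seems to involve fewer details. InputFormat covers parsing and splitting the input data, which in turn affects the number of Map tasks and how they are scheduled (see "A Brief Analysis of the Hadoop InputFormat"). OutputFormat, by contrast, seems to do just what its name says: format the output data. There is not much to say about formatting output data. As needed, ...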

2012-07-13 14:00
Views: 2167
Comments (0)
Category: Open Source Software

A Brief Analysis of the Hadoop InputFormat

Blog category:

hadoop

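When a Job is executed, Hadoop divides the input data into N Splits and then launches N corresponding Map programs to process them. How is the data divided? How are Splits scheduled (that is, how is it decided which TaskTracker machine should run the Map program that processes a given Split)? And how is the divided data then read? These are the questions this article discusses. Start from a classic MapReduce workflow diagram:
1. Run the mapred program;
2. This run generates a Job, so the JobClient requests a JobID from the JobTracker to identify the Job;
3. The JobClient submits the resources the Job needs to a directory in HDFS named after the JobID. These resources include the JAR file, configuration files, InputSplit ...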

2012-07-13 13:57
Views: 893
Comments (0)
Category: Open Source Software

Questions You May Encounter in a Hadoop Interview

Blog category:

hadoop

Q1. Name the most common InputFormats defined in Hadoop. Which one is the default?
The following 2 are the most common InputFormats defined in Hadoop:
- TextInputFormat

2012-07-13 13:50
Views: 1148
Comments (0)
Category: Open Source Software

Enabling LZO Compression on hadoop-0.20.203

Blog category:

hadoop

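1. Preparation: install ant (used in step 3 to compile the lzo codec; it can be skipped now that hadoop-lzo-package is used)
# Create a temporary directory, assuming the current working path is /work
cd /work
mkdir lzo
# Download ant from ant.apache.org
cd lzo
wget http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.2-bin.tar.gz
cd /usr/local
tar -zxvf /work/lzo/apache-ant-1.8.2-bin.tar.gz
# Add the ant environment variable
echo 'ex ...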

2012-07-12 16:45
Views: 1494
Comments (0)
Category: Open Source Software

Checking Whether Linux Is 32-bit or 64-bit

Blog category:

linux

How to check whether a Linux machine is 32-bit or 64-bit:
file /sbin/init or file /bin/ls
/sbin/init: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
If the output shows 64-bit, the system is 64-bit.
file /sbin/init
/sbin/init: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), fo ...
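The check on the `file` output can be scripted roughly as follows. The helper name `elf_bits` is hypothetical, and the two sample strings are shortened copies of the output quoted in the post, so the sketch runs even where the `file` utility is not installed.

```shell
# Classify one line of `file` output as 64, 32, or unknown.
elf_bits() {
    case "$1" in
        *"ELF 64-bit"*) echo 64 ;;
        *"ELF 32-bit"*) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Typical use on a live system (requires the `file` utility):
# elf_bits "$(file /bin/ls)"
elf_bits "/sbin/init: ELF 64-bit LSB executable, x86-64, version 1 (SYSV)"
elf_bits "/sbin/init: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV)"
```

On systems where /sbin/init is a symbolic link, `file -L` follows the link before inspecting the target.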

2012-07-12 10:30
Views: 826
Comments (0)
Category: Operating Systems

Linux: How to Display Certain Lines (a Middle Range) of a File

Blog category:

linux

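[1] Starting at line 3000, display 1000 lines, i.e. lines 3000 to 3999:
cat filename | tail -n +3000 | head -n 1000
[2] Display lines 1000 through 3000:
cat filename | head -n 3000 | tail -n +1000
* Note the order of the commands in the two methods. Breaking them down:
tail -n 1000: display the last 1000 lines
tail -n +1000: display from line 1000 onward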
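Both pipelines can be wrapped as small functions and tried on a generated file. The function names and the sample file are assumptions made for this sketch; the head and tail options are the same ones used in the post.

```shell
# Print lines START through END (inclusive) of FILE.
show_lines() {
    file=$1 start=$2 end=$3
    # head keeps lines 1..END; tail -n +START then drops everything before START.
    head -n "$end" "$file" | tail -n "+$start"
}

# Print COUNT lines of FILE beginning at line START.
show_from() {
    file=$1 start=$2 count=$3
    tail -n "+$start" "$file" | head -n "$count"
}

# Sample file whose line N contains the number N.
sample=$(mktemp)
seq 1 5000 > "$sample"

show_lines "$sample" 1000 3000 | wc -l          # counts lines 1000 through 3000
show_from "$sample" 3000 1000 | sed -n '1p;$p'  # first and last line of the range
```

Passing the file name to head or tail directly also avoids the extra `cat` process used in the original commands.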

2012-07-12 10:13
Views: 964
Comments (0)
Category: Operating Systems

24 Interview Questions & Answers for Hadoop MapReduce developers

Blog category:

hadoop

A good understanding of Hadoop architecture is required to understand and leverage the power of Hadoop. Here are a few important practical questions that a senior, experienced Hadoop developer may be asked in an interview. This list primarily includes questions related to Hadoop Architecture, MapRed ...

2012-07-12 10:12
Views: 732
Comments (0)
Category: Open Source Software

MAPR HBase and lzo installation

Blog category:

hbase

http://www.mapr.com/doc/display/MapR/HBase

2012-07-05 16:28
Views: 734
Comments (0)
Category: Open Source Software
