Error when running a jar file in Hadoop

When I run my jar file in Hadoop, I get a NullPointerException. I cannot figure out what the problem is.

Below is my driver class:

package mapreduce; 

import java.io.*; 

import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.conf.*; 
import org.apache.hadoop.io.*; 
import org.apache.hadoop.mapred.*; 
import org.apache.hadoop.util.*; 


public class StockDriver extends Configured implements Tool 
{ 
     public int run(String[] args) throws Exception 
     { 
      //creating a JobConf object and assigning a job name for identification purposes 
      JobConf conf = new JobConf(getConf(), StockDriver.class); 
      conf.setJobName("StockDriver"); 

      //Setting configuration object with the Data Type of output Key and Value 
      conf.setOutputKeyClass(Text.class); 
      conf.setOutputValueClass(IntWritable.class); 

      //Providing the mapper and reducer class names 
      conf.setMapperClass(StockMapper.class); 
      conf.setReducerClass(StockReducer.class); 

      File in = new File(args[0]); 
      int number_of_companies = in.listFiles().length; 
      for(int iter=1;iter<=number_of_companies;iter++) 
      { 
       Path inp = new Path(args[0]+"/i"+Integer.toString(iter)+".txt"); 
       Path out = new Path(args[1]+Integer.toString(iter)); 
       //the HDFS input and output directory to be fetched from the command line 
       FileInputFormat.addInputPath(conf, inp); 
       FileOutputFormat.setOutputPath(conf, out); 
       JobClient.runJob(conf); 
      } 
      return 0; 
     } 

     public static void main(String[] args) throws Exception 
     { 
      int res = ToolRunner.run(new Configuration(), new StockDriver(),args); 
      System.exit(res); 
     } 
}

Mapper class:

package mapreduce; 

import java.io.IOException; 
import gonn.ConstraintTree; 

import org.apache.hadoop.io.*; 
import org.apache.hadoop.mapred.*; 

public class StockMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> 
{ 
     //hadoop supported data types 
     private static IntWritable send; 
     private Text word; 

     //map method that performs the tokenizer job and framing the initial key value pairs 
     public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException 
     { 
      //taking one line at a time and tokenizing the same 
      String line = value.toString(); 
      String[] words = line.split(" "); 
      String out = ConstraintTree.isMain(words[1]); 
      word = new Text(out); 

      send = new IntWritable(Integer.parseInt(words[0])); 
      output.collect(word, send); 
     } 
}

Reducer class:

package mapreduce; 

import java.io.IOException; 
import java.util.Iterator; 

import org.apache.hadoop.io.*; 
import org.apache.hadoop.mapred.*; 

public class StockReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> 
{ 
     //reduce method accepts the Key Value pairs from mappers, do the aggregation based on keys and produce the final output 
     public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException 
     { 
      int val = 0; 

      while (values.hasNext()) 
      { 
       val += values.next().get(); 
      } 
      output.collect(key, new IntWritable(val)); 
     } 
}

Stack trace:

Exception in thread "main" java.lang.NullPointerException 
    at mapreduce.StockDriver.run(StockDriver.java:29) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) 
    at mapreduce.StockDriver.main(StockDriver.java:44) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

When I run the jar with java -jar myfile.jar args..., it works fine. But when I run it on the Hadoop cluster with hadoop jar myfile.jar [MainClass] args..., it fails with the error above.

Just to clarify, line 29 is int number_of_companies = in.listFiles().length;

2014-09-25 Darshil Babel

Are you running a separate MR job for each file in args[0]? – blackSmith

@blackSmith No, I am running the same MapReduce job in a loop for each file. –

The cause of this problem is using the java.io.File API to read HDFS files. If you create a File object with a path that does not exist, its listFiles method returns null. Since your input directory is in HDFS (I assume), it does not exist on the local file system, so the NPE comes from:

in.listFiles().length
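
The behavior can be reproduced with the JDK alone. The following is a minimal sketch, not part of the original code: the class name ListFilesNullDemo, the helper countLocalEntries, and the example path are hypothetical and chosen only for illustration. It checks for null instead of dereferencing the result directly.

```java
import java.io.File;

final class ListFilesNullDemo {

    // Counts the entries of a directory on the LOCAL file system.
    // Returns -1 when the path does not exist locally or is not a directory,
    // because File.listFiles() returns null in those cases.
    static int countLocalEntries(String path) {
        File[] entries = new File(path).listFiles();
        return entries == null ? -1 : entries.length;
    }

    public static void main(String[] args) {
        // Hypothetical path that exists only in HDFS, not on the local disk.
        String hdfsOnlyPath = "/hypothetical/hdfs-only/input";

        // listFiles() returns null for this path, so the helper reports -1;
        // calling .length on that null directly is what raises the NPE.
        System.out.println(countLocalEntries(hdfsOnlyPath));
    }
}
```

This also suggests why java -jar appeared to work: in that run the argument presumably named a directory that exists on the local disk.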

Use the following to get the number of files in an HDFS directory:

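// Requires: import org.apache.hadoop.fs.FileSystem;
// Resolve the directory through the Hadoop FileSystem API rather than java.io.File
FileSystem fs = FileSystem.get(new Configuration());
int number_of_companies = fs.listStatus(new Path(args[0])).length;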
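
For completeness, the same idea can be written as a self-contained sketch. It assumes the Hadoop client libraries are on the classpath; the class name HdfsEntryCount and the helper countEntries are hypothetical. Inside the driver, which already extends Configured, passing getConf() rather than a fresh Configuration would probably be preferable, so that options supplied through ToolRunner are honored.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class HdfsEntryCount {

    // Counts the direct children of a directory through the Hadoop FileSystem API.
    static int countEntries(Configuration conf, String dir) throws IOException {
        Path path = new Path(dir);
        // Resolve the file system from the path and the configuration: HDFS on a
        // configured cluster, the local file system when no cluster config is present.
        FileSystem fs = path.getFileSystem(conf);
        FileStatus[] entries = fs.listStatus(path);
        // Defensive check, in case an implementation returns null for a
        // missing path instead of throwing an exception.
        return entries == null ? 0 : entries.length;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the input directory, as in the driver above.
        System.out.println(countEntries(new Configuration(), args[0]));
    }
}
```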

2014-09-25 09:35:30 blackSmith
