SoFunction
Updated on 2024-11-15

Detailed steps for converting ImageNet 2012 data to TFRecord format for TensorFlow and running validation

Download the TensorFlow code

Address:

/tensorflow/

Then go to the directory:

cd models/research/slim/datasets/

Download the ImageNet 2012 dataset

You can register on the official website to download it, or refer to:

https:///article/

I put the data under the tensorflow path here:

./models/research/slim/datasets/imagenet2012

Here, models is the root of the TensorFlow code downloaded above, and imagenet2012 is a directory you create yourself to hold the downloaded archives:

The files marked in red are the ones I use. My goal is only evaluation, so ILSVRC2012_bbox_train_v2.tar should not be needed, but the conversion reported that certain files could not be found, so I added it as well. The v2 and v3 suffixes correspond to different tasks.
The directories marked in blue must be created beforehand; the archives are extracted into them later.
The data processing steps follow Huawei's documentation:

/enterprise/zh/doc/EDOC1100191905/a8d9a8a2

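You can prepare a script that extracts each archive into its corresponding directory:

#!/bin/bash
# Run from the imagenet2012 directory that holds the downloaded archives.
# Create the target directories first; tar -C fails if they do not exist.
mkdir -p train val bbox imagenet_tf
tar -xvf ILSVRC2012_img_train.tar -C train/
tar -xvf ILSVRC2012_img_val.tar -C val/
tar -xvf ILSVRC2012_bbox_train_v2.tar -C bbox/
tar -xvf ILSVRC2012_bbox_val_v3.tgz -C bbox/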
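
One thing the script above does not handle: in the official distribution, ILSVRC2012_img_train.tar contains one inner .tar per synset (for example n01440764.tar), so the training images need a second extraction pass. Below is a minimal Python sketch under that assumption. The function name extract_nested_train_tars is made up for illustration and is not part of slim.

```python
import os
import tarfile


def extract_nested_train_tars(train_dir):
    """Unpack every per-synset .tar in train_dir into a folder named after it.

    Hypothetical helper: assumes train_dir contains files like n01440764.tar.
    Returns the sorted list of synset folders that were created.
    """
    created = []
    for name in sorted(os.listdir(train_dir)):
        if not name.endswith(".tar"):
            continue
        synset = name[:-len(".tar")]
        target = os.path.join(train_dir, synset)
        os.makedirs(target, exist_ok=True)
        archive = os.path.join(train_dir, name)
        with tarfile.open(archive) as tar:
            tar.extractall(target)
        os.remove(archive)  # drop the inner archive once it is unpacked
        created.append(synset)
    return created


# Example call (path taken from the layout used in this article):
# extract_nested_train_tars("./imagenet2012/train")
```

If you only run validation, as I do here, this step can probably be skipped.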

Conversion

Here are the commands first; the changes that the Python files they call need before execution are described afterwards.

python preprocess_imagenet_validation_data.py ./imagenet2012/val/ imagenet_2012_validation_synset_labels.txt
python process_bounding_boxes.py ./imagenet2012/bbox/ imagenet_lsvrc_2015_synsets.txt | sort > imagenet_2012_bounding_boxes.csv
python build_imagenet_data.py --output_directory=./imagenet2012/imagenet_tf --validation_directory=./imagenet2012/val
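
To make the first command less of a black box: as far as I understand it, preprocess_imagenet_validation_data.py moves each flat validation image into a sub-folder named after its synset, which is the layout build_imagenet_data.py expects. The following is a rough sketch of that idea, not the script's actual code. The name sort_validation_images is hypothetical, and the file naming ILSVRC2012_val_00000001.JPEG is assumed from the official archive.

```python
import os
import shutil


def sort_validation_images(val_dir, labels):
    """Move the i-th validation image into a folder named after labels[i].

    Sketch only: assumes files are named ILSVRC2012_val_00000001.JPEG, ...
    and that labels is ordered like imagenet_2012_validation_synset_labels.txt.
    """
    for index, synset in enumerate(labels, start=1):
        filename = "ILSVRC2012_val_%08d.JPEG" % index
        target_dir = os.path.join(val_dir, synset)
        os.makedirs(target_dir, exist_ok=True)
        shutil.move(os.path.join(val_dir, filename),
                    os.path.join(target_dir, filename))
```

After this step, val/ should contain one folder per synset instead of a flat list of JPEG files.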

All three scripts are in the ./models/research/slim/datasets directory. TensorFlow code differs considerably between versions, and most of these scripts, build_imagenet_data.py for example, were written more than two years ago. In a current environment such as Python 3, running them directly produces many errors. For how to fix them, see:

https:///article/

First, change the data paths to your own:

Replace the parts marked in blue with your own values, as shown in red:

As you can see, I set both the train data path and the output path to the same path as val; otherwise n01440764 could not be found. I suspect something is still wrong with my data here.

Second, fix the return type of range

Around line 500:

# Originally: shuffled_index = range(len(filenames)). In Python 3, wrap it in list():
shuffled_index = list(range(len(filenames)))
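
Why list() is needed: in Python 3, range() returns an immutable sequence rather than a list, so shuffling it in place fails. A small standalone demonstration follows; the file names and the seed are placeholders, not values from the script.

```python
import random

filenames = ["a.JPEG", "b.JPEG", "c.JPEG", "d.JPEG"]  # stand-in file list

# Python 3: range() is an immutable sequence, so in-place shuffling fails.
try:
    random.shuffle(range(len(filenames)))
    shuffle_failed = False
except TypeError:
    shuffle_failed = True

# Wrapping it in list() gives shuffle a mutable sequence to work on.
random.seed(12345)
shuffled_index = list(range(len(filenames)))
random.shuffle(shuffled_index)
filenames = [filenames[i] for i in shuffled_index]
```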

Change to bytes

The part marked in blue was changed to the part marked in red. Many users say the part marked in green should be changed too, but when I changed it here I got an error instead.

Adjust the read/write mode

Change the part marked in blue to the part marked in red:
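
A change of this kind usually means opening the image files in binary mode ('rb') instead of text mode ('r'), because Python 3 tries to decode text-mode reads as a string. The sketch below shows the difference with plain open() as a stand-in for the file API used in the script; the temporary file and its contents are made up for the demonstration.

```python
import os
import tempfile

# Write a few JPEG-like bytes to a temporary file (stand-in for a real image).
path = os.path.join(tempfile.mkdtemp(), "fake.JPEG")
with open(path, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0" + b"\x00" * 16)

# Text mode tries to decode the bytes and fails under Python 3.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary mode returns the raw bytes unchanged.
with open(path, "rb") as f:
    image_data = f.read()
```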

Python 3 compatibility

Add a type check:
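
A check of this kind is typically a type test placed in front of the spot where a string is stored as a bytes feature: Python 3 strings are Unicode, while a TFRecord bytes field expects bytes. The sketch below is only an assumption about the shape of such a check. The helper to_bytes is hypothetical and does not appear in build_imagenet_data.py.

```python
def to_bytes(value):
    """Return value as bytes, encoding str with UTF-8 (hypothetical helper)."""
    if isinstance(value, str):
        return value.encode("utf-8")
    return value


synset = to_bytes("n01440764")      # str is encoded
raw = to_bytes(b"\xff\xd8\xff")     # bytes pass through untouched
```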

After these changes the conversion runs, and the result is:
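
To sanity-check the output directory, you can count the shard files. This sketch assumes the shards are named like validation-00000-of-00128, which is my assumption about the script's naming; adjust the pattern if your files look different. The function missing_shards is a made-up helper.

```python
import os
import re

SHARD_PATTERN = re.compile(r"^(train|validation)-(\d{5})-of-(\d{5})$")


def missing_shards(output_dir, split="validation"):
    """List shard indices that are absent for split in output_dir.

    Assumes file names of the form <split>-00000-of-00128 (an assumption).
    Returns None if no shard for the split is found at all.
    """
    found = set()
    total = None
    for name in os.listdir(output_dir):
        match = SHARD_PATTERN.match(name)
        if match and match.group(1) == split:
            found.add(int(match.group(2)))
            total = int(match.group(3))
    if total is None:
        return None
    return sorted(set(range(total)) - found)


# Example call: missing_shards("./imagenet2012/imagenet_tf")
```

An empty list means every shard announced in the file names is present.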

Run validation

python eval_image_classifier.py \
  --checkpoint_path='./weights' \
  --eval_dir='./log/' \
  --dataset_name=imagenet \
  --dataset_split_name=validation \
  --dataset_dir='./datasets/imagenet2012/imagenet_tf/' \
  --model_name=resnet_v1_50

Parameter description:
checkpoint_path: accepts either a directory path or a file path. If it is a directory, the latest checkpoint in that directory is used.
eval_dir: directory where the evaluation logs are saved.
dataset_name: the dataset to use, imagenet in my case. It must match the dataset of the task.
dataset_split_name: the split to evaluate on. Note that validation here runs on the validation set.
dataset_dir: location of the TFRecord data.
model_name: name of the model. It must match the checkpoint given by checkpoint_path.

The following is printed after execution:

eval/Accuracy[0.51]
eval/Recall_5[0.973333336]

Accuracy is the model's top-1 classification accuracy, and Recall_5 is the top-5 accuracy, that is, the fraction of images whose true label is among the five highest-scoring predictions.
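
To make the two metrics concrete, here is a small pure-Python sketch of top-k accuracy. It is not how slim computes the values internally, and the scores and labels are toy numbers invented for the example.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


scores = [
    [0.1, 0.7, 0.2],    # highest score: class 1
    [0.5, 0.3, 0.2],    # highest score: class 0
    [0.2, 0.3, 0.5],    # highest score: class 2
    [0.4, 0.35, 0.25],  # highest score: class 0
]
labels = [1, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)  # plays the role of Accuracy
top2 = top_k_accuracy(scores, labels, k=2)  # same idea as Recall_5, with k=2
```

With k=1 only the single best prediction counts; a larger k can only keep the value the same or raise it.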
