SoFunction
Updated on 2025-05-21

Distributed file storage based on MongoDB

1. Introduction

When a system has to store and manage a large number of images, videos, documents, and other files, storing them efficiently and reliably across a distributed deployment becomes a key issue. MongoDB's GridFS, a mechanism for storing files in MongoDB, offers a practical solution. It builds on MongoDB's distributed architecture, so it can cope with large volumes of files while exposing a convenient interface for file operations.

2. How GridFS works

GridFS is MongoDB's specification for storing large files. It splits a file into smaller chunks (255 KB each by default) and stores them in the fs.chunks collection, while the file's metadata (file name, size, creation time, MIME type, and so on) is stored in the fs.files collection. This design gets around MongoDB's 16 MB limit on the size of a single document, and because both are ordinary collections, they can use MongoDB's replication and sharding to achieve distributed storage and efficient reading of files.
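To make the chunking idea concrete, here is a minimal in-memory sketch. It does not use the MongoDB driver; the class and method names are invented for illustration, and only the filesId / n / data triple mirrors the files_id, n, and data fields of a chunk document in fs.chunks.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** In-memory illustration of GridFS-style chunking; it does not talk to MongoDB. */
class ChunkingSketch {

    /** Mirrors the idea of a chunk document: owning file, sequence number, payload. */
    static final class Chunk {
        final String filesId;
        final int n;
        final byte[] data;

        Chunk(String filesId, int n, byte[] data) {
            this.filesId = filesId;
            this.n = n;
            this.data = data;
        }
    }

    /** Cuts content into pieces of chunkSize bytes; only the last piece may be shorter. */
    static List<Chunk> split(String filesId, byte[] content, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (int offset = 0, n = 0; offset < content.length; offset += chunkSize, n++) {
            int end = Math.min(offset + chunkSize, content.length);
            chunks.add(new Chunk(filesId, n, Arrays.copyOfRange(content, offset, end)));
        }
        return chunks;
    }

    /** Rebuilds the file by concatenating its chunks in ascending order of n. */
    static byte[] join(List<Chunk> chunks) {
        List<Chunk> ordered = new ArrayList<>(chunks);
        ordered.sort(Comparator.comparingInt((Chunk c) -> c.n));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Chunk c : ordered) {
            out.write(c.data, 0, c.data.length);
        }
        return out.toByteArray();
    }
}
```

Because every chunk carries its own sequence number, the chunks can be stored and fetched independently and still be reassembled in the right order.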

For example, when we upload a 1 GB video file with the default chunk size of 255 KB, GridFS divides it into 4,113 chunks and records the file's information in the fs.files collection. If the fs.chunks collection is sharded, these chunks can be spread across different MongoDB nodes.
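The chunk count is a ceiling division of the file length by the chunk size. The sketch below is plain Java with invented names and no driver dependency. It assumes 1 GB means 2^30 bytes and uses the GridFS default of 255 KiB, which gives 4,113 chunks; a 256 KiB chunk size would give exactly 4,096.

```java
/** Chunk arithmetic for a GridFS-style store; plain Java, no MongoDB driver involved. */
class ChunkCount {

    /** Default GridFS chunk size: 255 KiB. */
    static final int DEFAULT_CHUNK_SIZE = 255 * 1024;

    /** Number of chunks needed for a file: ceil(fileLength / chunkSize). */
    static long chunkCount(long fileLength, int chunkSize) {
        if (fileLength < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and chunkSize > 0");
        }
        return (fileLength + chunkSize - 1) / chunkSize;
    }

    /** Size in bytes of the final chunk, the only one that may be shorter. */
    static long lastChunkSize(long fileLength, int chunkSize) {
        if (fileLength == 0) {
            return 0;
        }
        long remainder = fileLength % chunkSize;
        return remainder == 0 ? chunkSize : remainder;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        System.out.println(chunkCount(oneGiB, DEFAULT_CHUNK_SIZE));    // 4113
        System.out.println(lastChunkSize(oneGiB, DEFAULT_CHUNK_SIZE)); // 16384
        System.out.println(chunkCount(oneGiB, 256 * 1024));            // 4096
    }
}
```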

3. Integrating GridFS with Spring Boot

In real projects, we usually use Spring Boot together with MongoDB. The integration steps and code examples follow.

3.1 Add dependencies

Add the Spring Boot MongoDB-related dependencies to the pom.xml file:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-mongodb</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>

3.2 Configuring MongoDB Connection

Configure the MongoDB connection information in application.properties:

spring.data.mongodb.uri=mongodb://localhost:27017/fs
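# Assumed key: the database name, matching the database in the URI above
spring.data.mongodb.database=fs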

3.3 Writing a service class
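The service class in this section relies on a FileInfo type that the article does not show. As a purely hypothetical sketch, it could be a plain data holder like the one below. Every field and accessor name is an assumption chosen for illustration, not the author's actual class.

```java
import java.util.Date;

/** Hypothetical metadata holder; every name here is an assumption made for illustration. */
class FileInfo {
    private String fileId;       // id assigned by GridFS after the upload
    private String fileName;     // name under which the file is stored
    private String originalName; // name of the file as the client sent it
    private String suffix;       // extension without the dot, e.g. "png"
    private String contentType;  // MIME type, e.g. "image/png"
    private long size;           // length in bytes
    private Date createTime;     // when the record was created

    public String getFileId() { return fileId; }
    public void setFileId(String fileId) { this.fileId = fileId; }

    public String getFileName() { return fileName; }
    public void setFileName(String fileName) { this.fileName = fileName; }

    public String getOriginalName() { return originalName; }
    public void setOriginalName(String originalName) { this.originalName = originalName; }

    public String getSuffix() { return suffix; }
    public void setSuffix(String suffix) { this.suffix = suffix; }

    public String getContentType() { return contentType; }
    public void setContentType(String contentType) { this.contentType = contentType; }

    public long getSize() { return size; }
    public void setSize(long size) { this.size = size; }

    public Date getCreateTime() { return createTime; }
    public void setCreateTime(Date createTime) { this.createTime = createTime; }
}
```

An object like this can be handed to GridFS as the metadata of a stored file, which is why it only holds simple, serializable values.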

Use GridFsTemplate and GridFSBucket to implement upload, download, delete, and other file operations:

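import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import org.bson.types.ObjectId;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.gridfs.GridFsResource;
import org.springframework.data.mongodb.gridfs.GridFsTemplate;
import org.springframework.stereotype.Service;

import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSDownloadStream;
import com.mongodb.client.gridfs.model.GridFSFile;

// Assumed: Hutool helpers, used when a file is downloaded to disk
import cn.hutool.core.bean.BeanUtil;
import cn.hutool.core.io.FileUtil;

@Service
public class MongoFsStoreService implements FsStoreService {

    private final GridFsTemplate gridFsTemplate;

    // Optional: only set when a GridFSBucket bean exists in the application context
    private GridFSBucket gridFSBucket;

    public MongoFsStoreService(GridFsTemplate gridFsTemplate) {
        this.gridFsTemplate = gridFsTemplate;
    }

    @Autowired(required = false)
    public void setGridFSBucket(GridFSBucket gridFSBucket) {
        this.gridFSBucket = gridFSBucket;
    }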
 
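    /**
     * Uploads a file to GridFS.
     *
     * @param in       content of the file
     * @param fileInfo descriptive information, stored as the GridFS metadata
     * @return the same fileInfo, updated with the id GridFS generated
     */
    @Override
    public FileInfo uploadFile(InputStream in, FileInfo fileInfo) {
        // getFileName(), getContentType() and setFileId() are assumed accessors of FileInfo
        ObjectId objectId = gridFsTemplate.store(in, fileInfo.getFileName(), fileInfo.getContentType(), fileInfo);
        fileInfo.setFileId(objectId.toString());
        return fileInfo;
    }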
 
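    /**
     * Uploads a file when only its name is known.
     *
     * @param in       content of the file
     * @param fileName name of the file
     * @return information about the stored file
     */
    @Override
    public FileInfo uploadFile(InputStream in, String fileName) {
        // buildFileInfo is a placeholder name for the project's own FileInfo factory
        FileInfo fileInfo = buildFileInfo(in, fileName);
        return uploadFile(in, fileInfo);
    }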
 
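    /**
     * Downloads a file from GridFS to the local file system.
     *
     * @param fileId id of the stored file
     * @return the local copy, or null if no file has this id
     */
    @Override
    public File downloadFile(String fileId) {
        GridFsResource gridFsResource = download(fileId);
        if (gridFsResource != null) {
            GridFSFile gridFSFile = gridFsResource.getGridFSFile();
            // Assumed: Hutool's BeanUtil maps the metadata document back to FileInfo
            FileInfo fileInfo = BeanUtil.toBean(gridFSFile.getMetadata(), FileInfo.class);

            try (InputStream in = gridFsResource.getInputStream()) {
                // Assumed: Hutool's FileUtil writes the stream to a file named by getFileName()
                return FileUtil.writeFromStream(in, fileInfo.getFileName());
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
        return null;
    }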
 
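    /**
     * Finds a file in GridFS and opens it for reading.
     *
     * @param fileId id of the stored file
     * @return a resource that streams the file, or null if no file has this id
     */
    public GridFsResource download(String fileId) {
        // Assumed: the file is looked up by its GridFS _id
        GridFSFile gridFSFile = gridFsTemplate.findOne(Query.query(Criteria.where("_id").is(fileId)));
        if (gridFSFile == null) {
            return null;
        }

        if (gridFSBucket == null) {
            return gridFsTemplate.getResource(gridFSFile);
        }
        GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(gridFSFile.getObjectId());
        return new GridFsResource(gridFSFile, downloadStream);
    }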
 
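    /**
     * Deletes a file from GridFS.
     *
     * @param fileId id of the stored file
     */
    @Override
    public void deleteFile(String fileId) {
        // Assumed: the file is matched by its GridFS _id
        gridFsTemplate.delete(Query.query(Criteria.where("_id").is(fileId)));
    }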
 
}
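The service methods receive fileId as a String, while GridFS assigns an ObjectId to each stored file. If the id passed in is meant to be that ObjectId (an assumption, since the text does not say which field is queried), a malformed value can be rejected before any query is sent. The following sketch is plain Java with an invented class name; the driver's own org.bson.types.ObjectId.isValid performs a comparable check.

```java
/** Pre-check for ids received as strings; plain Java, independent of the MongoDB driver. */
class ObjectIdCheck {

    /** An ObjectId in hexadecimal form is exactly 24 characters from 0-9, a-f, A-F. */
    static boolean looksLikeObjectId(String id) {
        if (id == null || id.length() != 24) {
            return false;
        }
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            boolean hex = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }
}
```

A controller could call such a check first and answer with a 400 status for ids that cannot match any stored file.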
 

3.4 Creating a Controller

Expose a REST API so that external clients can call the service:

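import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Date;
import java.util.UUID;

import org.springframework.core.io.InputStreamResource;
import org.springframework.data.mongodb.gridfs.GridFsResource;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

import com.mongodb.client.gridfs.model.GridFSFile;

// jakarta.servlet on Spring Boot 3; use javax.servlet on Spring Boot 2
import jakarta.servlet.http.HttpServletResponse;

// Assumed: Hutool helpers
import cn.hutool.core.bean.BeanUtil;
import cn.hutool.core.io.FileUtil;

@RestController
@RequestMapping("/mongo")
public class MongoFsStoreController {

    private final MongoFsStoreService mongoFsStoreService;

    public MongoFsStoreController(MongoFsStoreService mongoFsStoreService) {
        this.mongoFsStoreService = mongoFsStoreService;
    }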
 
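    /**
     * Uploads a file sent as multipart form data.
     *
     * @param file the uploaded file
     * @return information about the stored file, or an error result
     */
    @RequestMapping("/upload")
    public ResponseEntity<Result> uploadFile(@RequestParam("file") MultipartFile file) {
        try (InputStream in = file.getInputStream()) {
            FileInfo fileInfo = convertMultipartFile(file);
            // Result.success and Result.error are assumed factories of the project's Result wrapper
            return ResponseEntity.ok(Result.success(mongoFsStoreService.uploadFile(in, fileInfo)));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body(Result.error(HttpStatus.INTERNAL_SERVER_ERROR.value(), e.getMessage()));
        }
    }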
 
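    private FileInfo convertMultipartFile(MultipartFile file) {
        // The FileInfo setters are assumed names; FileUtil.extName is Hutool's extension helper (assumed)
        FileInfo fileInfo = new FileInfo();
        fileInfo.setSuffix(FileUtil.extName(file.getOriginalFilename()));
        fileInfo.setFileName(UUID.randomUUID().toString() + "." + fileInfo.getSuffix()); // unique stored name
        fileInfo.setOriginalName(file.getOriginalFilename());
        fileInfo.setContentType(file.getContentType());
        fileInfo.setSize(file.getSize());
        fileInfo.setCreateTime(new Date());
        return fileInfo;
    }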
 
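    /**
     * Downloads a file by writing it to the servlet response.
     *
     * @param fileId   id of the stored file
     * @param response response that receives the file content
     */
    @RequestMapping("/download")
    public void downloadFile(@RequestParam("fileId") String fileId, HttpServletResponse response) {
        File file = mongoFsStoreService.downloadFile(fileId);
        if (file != null) {
            response.setContentType("application/octet-stream");
            response.setHeader("Content-Disposition", "attachment; filename=\"" + file.getName() + "\"");
            try {
                // Assumed: Hutool's FileUtil copies the file to the response stream
                FileUtil.writeToStream(file, response.getOutputStream());
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }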
 
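    @RequestMapping("/download/{fileId}")
    public ResponseEntity<InputStreamResource> download(@PathVariable("fileId") String fileId) throws IOException {
        GridFsResource resource = mongoFsStoreService.download(fileId);
        if (resource != null) {
            GridFSFile gridFSFile = resource.getGridFSFile();
            // Assumed: Hutool's BeanUtil maps the metadata document back to FileInfo
            FileInfo fileInfo = BeanUtil.toBean(gridFSFile.getMetadata(), FileInfo.class);

            // The body is streamed; the file is never held in memory as a whole
            return ResponseEntity.ok()
                    .header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"" + fileInfo.getFileName() + "\"")
                    .contentLength(gridFSFile.getLength())
//                    .contentType(MediaType.parseMediaType(fileInfo.getContentType()))
                    .body(new InputStreamResource(resource.getInputStream()));
        }
        return ResponseEntity.notFound().build();
    }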
 
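    /**
     * Deletes a file.
     *
     * @param fileId id of the stored file
     * @return a confirmation message
     */
    @RequestMapping("/delete")
    public ResponseEntity<String> deleteFile(@RequestParam("fileId") String fileId) {
        mongoFsStoreService.deleteFile(fileId);
        return ResponseEntity.ok("Deleted successfully");
    }
}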

4. Common problems and solutions in practice

4.1 Memory management when downloading files

When downloading files, GridFSDownloadStream provides streaming access, so the whole file never has to be loaded into memory at once. We can wrap the stream in a GridFsResource and return it directly to the client, so the file is transmitted while it is being read, which saves memory. For example:

// Correct: return an InputStreamResource directly, so the file is sent while it is read
return ResponseEntity.ok()
       .body(new InputStreamResource(resource.getInputStream()));

By contrast, avoid reading the entire file into a byte array before returning it, as in the following counterexample:

// Wrong: load the entire file into memory, then return it
byte[] content = resource.getInputStream().readAllBytes();
return ResponseEntity.ok()
       .body(content);
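The difference between the two approaches comes down to how much data is held at once. As a minimal sketch with invented names and no Spring or MongoDB dependency, a streaming copy keeps only one fixed-size buffer in memory, however large the file is. The JDK's InputStream.transferTo does essentially the same job.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Copies a stream with a bounded buffer instead of materializing the whole content. */
class StreamingCopy {

    /** Returns the number of bytes copied; memory use is limited to bufferSize bytes. */
    static long copy(InputStream in, OutputStream out, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

With an 8 KB buffer, copying a multi-gigabyte download needs the same amount of heap as copying a thumbnail, whereas readAllBytes needs heap proportional to the file size.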

5. Summary

Thanks to its chunked storage and its close integration with MongoDB's distributed architecture, GridFS gives us an efficient and reliable way to store files. With Spring Boot, file upload, download, query, and delete functions can be added to a project quickly. In real applications, pay attention to common problems such as memory management, data type conversion, and time type handling, and apply the appropriate solutions.

For small and medium-sized files, GridFS is a simple and efficient choice; for very large files or scenarios that demand extreme performance, consider combining it with object storage such as MinIO or S3.
