1. Installation
Install pydub with pip. It also depends on ffmpeg; installing ffmpeg with conda is recommended, because then no extra environment configuration is needed:
pip install pydub
2. Importing and reading audio files
from pydub import AudioSegment
audio = AudioSegment.from_file("path/to/file")
3. Audio playback
from pydub.playback import play
play(audio)
4. Audio duration
duration = audio.duration_seconds # in seconds
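For uncompressed audio, the duration is simply the number of frames divided by the frame rate. As a cross-check, here is a minimal sketch that uses only the standard-library wave module (not pydub) on a WAV generated in memory. The helper names wav_duration_seconds and make_silent_wav are made up for this illustration.

```python
import io
import wave


def wav_duration_seconds(fileobj):
    """Return the duration of a WAV file: number of frames / frame rate."""
    with wave.open(fileobj, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def make_silent_wav(seconds, frame_rate=8000):
    """Build an in-memory mono 16-bit WAV that contains only silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(frame_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * frame_rate))
    buf.seek(0)
    return buf


print(wav_duration_seconds(make_silent_wav(2.5)))  # 2.5
```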
5. Audio cutting
# Slice positions are in milliseconds
# First 10 seconds (from the start to the 10th second)
first_10s = audio[:10000]
# Last 10 seconds
last_10s = audio[-10000:]
# From the 10th second to the 20th second
middle = audio[10000:20000]
# From the 10th second to the end
rest = audio[10000:]
6. Audio merging
audio1 = AudioSegment.from_file("path/to/file1")
audio2 = AudioSegment.from_file("path/to/file2")
audio_combined = audio1 + audio2
7. Audio conversion
audio.export("path/to/new/file", format="mp3")
8. Volume adjustment
# Increase by 10 dB
louder_audio = audio + 10
# Decrease by 10 dB
quieter_audio = audio - 10
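Decibels are a logarithmic unit, so adding 10 dB is not the same as multiplying the signal by 10. The sketch below shows the conversion, assuming the usual amplitude convention (ratio = 10^(dB/20)). The function db_to_amplitude_ratio is a hypothetical helper written for this illustration, not part of pydub.

```python
def db_to_amplitude_ratio(db):
    """Convert a gain in decibels to a linear amplitude ratio."""
    return 10 ** (db / 20)


for gain in (10, -10, 6):
    print(f"{gain:+d} dB -> x{db_to_amplitude_ratio(gain):.3f}")
# +10 dB -> x3.162
# -10 dB -> x0.316
# +6 dB -> x1.995
```

So a 10 dB increase corresponds to roughly 3.2 times the amplitude, and about 6 dB doubles it.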
9. Splitting audio equally
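# Split into equal parts of roughly three minutes each.
# Find the smallest number of parts whose length is between 2.8 and 3.3 minutes.
# Note: if no value fits (for example, audio shorter than 2.8 minutes), number is never set.
for i in range(1, 1000):
    if 3.3 >= (audio.duration_seconds / (60 * i)) >= 2.8:
        number = i
        break
# Slicing with a step (in milliseconds) yields consecutive chunks of that length
chunks = audio[::int(audio.duration_seconds / number * 1000 + 1)]
# Save the chunks
for i, chunk in enumerate(chunks):
    chunk.export("path/to/new/file{}.wav".format(i), format="wav")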
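The search for a suitable number of parts is plain arithmetic, so it can be tried out without any audio. The following is a minimal sketch of that search; the function names pick_chunk_count and chunk_length_ms are hypothetical, and returning None when nothing fits is an addition for illustration.

```python
def pick_chunk_count(duration_seconds, low_min=2.8, high_min=3.3, max_count=1000):
    """Return the smallest count whose chunk length lies in [low_min, high_min] minutes.

    Returns None when no count below max_count fits.
    """
    for count in range(1, max_count):
        minutes_per_chunk = duration_seconds / (60 * count)
        if low_min <= minutes_per_chunk <= high_min:
            return count
    return None


def chunk_length_ms(duration_seconds, count):
    """Chunk length in milliseconds, rounded up by 1 ms so no tiny extra chunk is left over."""
    return int(duration_seconds / count * 1000 + 1)


print(pick_chunk_count(30 * 60))      # 10 (ten chunks of 3 minutes)
print(chunk_length_ms(30 * 60, 10))   # 180001
print(pick_chunk_count(4 * 60))       # None (1 chunk is too long, 2 chunks are too short)
```

The last call shows that some durations have no valid split in the 2.8 to 3.3 minute window, so it is worth handling that case explicitly.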
10. Complete code
Below is a complete script that trims the start and end of the audio and then splits it into segments of a suitable length and saves them.
from pydub import AudioSegment

# Read the audio file
audio = AudioSegment.from_file("path/to/file")
# Print the duration in minutes
print('Duration (minutes):', audio.duration_seconds / 60)

# Trim the start and end
start = int(input('Seconds to cut from the start (0 for none): ')) * 1000
end = int(input('Seconds to cut from the end (0 for none): ')) * 1000
# Use an explicit end position: audio[start:-end] would be empty when end is 0
audio = audio[start:len(audio) - end]

# Calculate a suitable split length (about three minutes per chunk)
for i in range(1, 1000):
    if 3.3 >= (audio.duration_seconds / (60 * i)) >= 2.8:
        number = i
        break
chunks = audio[::int(audio.duration_seconds / number * 1000 + 1)]

# Save the chunks
for i, chunk in enumerate(chunks):
    print('Chunk duration (minutes):', chunk.duration_seconds / 60)
    chunk.export("path/to/new/file{}.wav".format(i), format="wav")
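One thing to watch when trimming with a negative end index: in Python, -0 is 0, so a slice such as x[start:-0] is empty instead of running to the end. The sketch below demonstrates this on a plain list and shows one way to compute safe slice positions. The helper trim_bounds is hypothetical and only illustrates the idea.

```python
def trim_bounds(total_ms, cut_start_ms, cut_end_ms):
    """Return (start, stop) positions that drop cut_start_ms from the front
    and cut_end_ms from the back, never letting stop fall before start."""
    start = min(cut_start_ms, total_ms)
    stop = max(total_ms - cut_end_ms, start)
    return start, stop


samples = list(range(10))            # stand-in for 10 ms of audio
print(samples[2:-0])                 # [] because -0 == 0
start, stop = trim_bounds(len(samples), 2, 0)
print(samples[start:stop])           # [2, 3, 4, 5, 6, 7, 8, 9]
```

The same (start, stop) pair can be used to slice an AudioSegment, since its slice positions are also counted in milliseconds.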
These are the main points of pydub and a complete example. With pydub, we can easily process and convert audio, making our audio processing more efficient and convenient.
In addition, some other use cases for pydub are listed below.
Application Cases
1. Convert audio files to a specified format
from pydub import AudioSegment

# Read the audio file
audio = AudioSegment.from_file("path/to/file")
# Convert to mp3 and save
audio.export("path/to/new/file.mp3", format="mp3")
2. Merge multiple audio files into a single file
from pydub import AudioSegment

# Read the audio files
audio1 = AudioSegment.from_file("path/to/file1")
audio2 = AudioSegment.from_file("path/to/file2")
# Merge the audio files and save
combined_audio = audio1 + audio2
combined_audio.export("path/to/new/file", format="wav")
3. Ringtone production
from pydub import AudioSegment

# Read the audio file
audio = AudioSegment.from_file("path/to/file")
# Cut out the 10 s to 15 s range (positions in milliseconds) and save
start = 10000
end = 15000
ringtone = audio[start:end]
ringtone.export("path/to/new/file", format="mp3")
4. Adjusting audio volume
from pydub import AudioSegment

# Read the audio file
audio = AudioSegment.from_file("path/to/file")
# Increase by 10 dB
louder_audio = audio + 10
# Decrease by 10 dB
quieter_audio = audio - 10
# Save the adjusted audio (use two different paths so the second export does not overwrite the first)
louder_audio.export("path/to/new/file_louder", format="wav")
quieter_audio.export("path/to/new/file_quieter", format="wav")
Case: Splitting a recording into songs by detecting silence
from pydub import AudioSegment
from pydub.silence import split_on_silence

# Read the audio file
audio = AudioSegment.from_file("audio.mp3", format="mp3")

# Segmentation parameters
min_silence_len = 700  # Minimum silence length, in milliseconds
silence_thresh = -10   # Silence threshold, in dBFS; smaller is stricter (starting value for the loop)
keep_silence = 600     # Silence kept at the edges of each segment, in milliseconds

# Estimate the number of songs, assuming roughly three minutes per song
num_segments = int(audio.duration_seconds / 60 / 3)

# Split the audio, trying thresholds from -10 up to -1
for i in range(silence_thresh, 0):
    segments = split_on_silence(audio, min_silence_len=min_silence_len, silence_thresh=i, keep_silence=keep_silence)
    if len(segments) <= num_segments:
        print(f"Split succeeded: {len(segments)} segments in total")
        break
    else:
        print(f"Threshold {i} produced {len(segments)} segments, still trying.")
First, we read the audio file with AudioSegment.from_file() and set the segmentation parameters min_silence_len, silence_thresh and keep_silence, which are the minimum silence length, the silence threshold and the amount of silence to keep. In this setup, a lower silence threshold produces more small segments but may split in the wrong places; a higher threshold produces fewer segments but may miss some split points.
Then we calculate num_segments, the number of segments we expect the file to be split into. Here we assume that each song is about three minutes long and derive the expected count from the total duration.
Finally, we call split_on_silence() with these parameters and keep adjusting the silence threshold in a loop until the number of segments is as expected. If the split succeeds, the loop exits; otherwise it tries the next threshold.
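The threshold search described above can be sketched independently of the audio library. In the sketch below, find_threshold and fake_split are hypothetical names; fake_split merely stands in for a call to split_on_silence so the loop can be run without an audio file. Rejecting an empty result and returning None when no threshold works are extra checks added for this illustration.

```python
def find_threshold(split_fn, thresholds, max_segments):
    """Try thresholds in order and return (threshold, segments) for the first
    split that yields between 1 and max_segments segments, or None."""
    for threshold in thresholds:
        segments = split_fn(threshold)
        if 0 < len(segments) <= max_segments:
            return threshold, segments
    return None


def fake_split(threshold):
    """Stand-in splitter: pretends stricter thresholds produce more segments."""
    return ["segment"] * (abs(threshold) * 2)


result = find_threshold(fake_split, range(-10, 0), max_segments=6)
print(result[0], len(result[1]))  # -3 6
```

With real audio, split_fn could be a small lambda that calls split_on_silence with the chosen threshold and the other parameters fixed.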
All in all, pydub is a very useful audio processing library, which can be easily used for audio processing, conversion, merging and other operations. At the same time, pydub also has a wealth of application scenarios, such as making ringtones, adjusting the volume and so on. It is worth noting that in the process of using pydub, you need to pay attention to the compatibility of audio formats.
In addition, pydub can be used to encode, decode, mix and resample audio. Here are some examples of these common operations.
Codecs, mixing, resampling
1. Codecs
from pydub import AudioSegment

# Read the audio file
audio = AudioSegment.from_file("path/to/file")
# "Encode": convert to 16 kHz, 2 bytes per sample (16-bit), mono
encoded_audio = audio.set_frame_rate(16000).set_sample_width(2).set_channels(1)
# "Decode": convert to 44.1 kHz, 4 bytes per sample (32-bit), stereo
decoded_audio = encoded_audio.set_frame_rate(44100).set_sample_width(4).set_channels(2)
2. Mixing
from pydub import AudioSegment

# Read the audio files
audio1 = AudioSegment.from_file("path/to/file1")
audio2 = AudioSegment.from_file("path/to/file2")
# Mix: play audio2 on top of audio1
mixed_audio = audio1.overlay(audio2)
# Save the mixed audio
mixed_audio.export("path/to/new/file", format="wav")
3. Resampling
from pydub import AudioSegment

# Read the audio file
audio = AudioSegment.from_file("path/to/file")
# Resample to 44100 Hz
resampled_audio = audio.set_frame_rate(44100)
# Save the resampled audio
resampled_audio.export("path/to/new/file", format="wav")
With pydub, we can easily encode, decode, mix and resample audio, which further widens its range of uses. When mixing, it is safest to make sure that the two audio files have the same sample rate, sample width (bits per sample) and number of channels.
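To see why matching formats matter for mixing, it helps to look at what mixing means at the sample level: the samples of the two signals are added one by one and the result is kept within the valid range. The sketch below is purely conceptual and is not how pydub implements mixing; mix_samples is a hypothetical helper operating on plain lists of signed integer samples that are assumed to share the same sample rate and width.

```python
def mix_samples(a, b, sample_width=2):
    """Mix two sequences of signed integer samples by summing them
    and clamping to the range allowed by sample_width (in bytes)."""
    limit = 2 ** (8 * sample_width - 1)
    lowest, highest = -limit, limit - 1
    length = max(len(a), len(b))
    # Pad the shorter signal with silence (zeros)
    a = list(a) + [0] * (length - len(a))
    b = list(b) + [0] * (length - len(b))
    return [max(lowest, min(highest, x + y)) for x, y in zip(a, b)]


print(mix_samples([1000, -2000, 30000], [500, 500, 10000, 7]))
# [1500, -1500, 32767, 7]
```

The third value is clamped to 32767, the largest 16-bit sample, which is why mixing two loud signals can cause clipping.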
Finally, here is a summary of the pros and cons of pydub.
Pros:
Lightweight: pydub is a lightweight audio processing library that is easy to install and simple to use.
Feature-rich: pydub provides rich audio processing functions, including cutting, merging, converting, adjusting volume, codecs, mixing, resampling and so on.
Wide range of applications: pydub has a wide range of application scenarios, including audio processing, ringtone production, audio format conversion, speech recognition and so on.
Drawbacks:
Limited compatibility with formats: pydub does not support every audio format, so some files need to be converted to a supported format before processing.
Mediocre performance: pydub may have mediocre performance when dealing with large files, requiring some time and computational resources.
Doesn't support streaming: pydub reads the whole audio file into memory, which results in a large memory footprint.
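To get a feel for the memory footprint, the size of uncompressed audio can be estimated from its parameters. This is a rough sketch that assumes the decoded audio is held as raw PCM; actual memory use may be higher. The function pcm_size_bytes is a hypothetical helper for this estimate.

```python
def pcm_size_bytes(duration_seconds, frame_rate=44100, sample_width=2, channels=2):
    """Estimate raw PCM size: seconds * frames per second * bytes per sample * channels."""
    return int(duration_seconds * frame_rate * sample_width * channels)


one_hour_cd_quality = pcm_size_bytes(3600)
one_hour_speech = pcm_size_bytes(3600, frame_rate=16000, sample_width=2, channels=1)
print(f"{one_hour_cd_quality / 1024**2:.0f} MiB")  # 606 MiB
print(f"{one_hour_speech / 1024**2:.0f} MiB")      # 110 MiB
```

An hour of 44.1 kHz 16-bit stereo audio therefore takes roughly 600 MiB once decoded, regardless of how small the compressed file on disk is.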
In summary, pydub is a feature-rich, widely used audio processing library. When using pydub, you need to pay attention to audio format compatibility issues, and pay attention to performance and memory usage when processing large files. If you need to handle more complex audio tasks, you can consider using other more specialized audio processing libraries.
Summary
This post describes how to split an audio file into multiple small segments using Python's pydub library. We first read the audio file, then set the segmentation parameters and calculate the expected number of segments. Finally, a loop keeps adjusting the silence threshold until the number of segments is as expected. This approach can be used to prepare audio for processing and analysis tasks such as speech recognition and music recommendation.
Note that the quality of the segmentation depends on the segmentation parameters, which need to be tuned for each case. In addition, the result may contain wrong splits and missed splits, which need to be checked and corrected afterwards.