np.float_ -> np.float64
giovana-morais committed Aug 8, 2024
1 parent 0c938ba commit 7008b91
Showing 3 changed files with 34 additions and 34 deletions.
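
For context on the rename: `np.float_` was an alias for `np.float64` that NumPy 2.0 removed, so the change keeps the same dtype under a name that works on both NumPy 1.x and 2.x. A minimal sketch of that equivalence follows; it is not part of this commit, and the variable names are illustrative only.

```python
import numpy as np

# np.float_ exists only on NumPy 1.x; getattr with a default avoids the
# AttributeError that NumPy 2.x raises for the removed alias.
legacy_alias = getattr(np, "float_", None)

if legacy_alias is not None:
    # On NumPy 1.x the alias is the very same type object as np.float64.
    assert legacy_alias is np.float64

# np.float64 is available on both major versions and is 8 bytes wide.
samples = np.zeros(4, dtype=np.float64)
print(samples.dtype, samples.dtype.itemsize)
```

Because the alias and `np.float64` refer to the same type, callers relying on the default `dtype` of `read_stems` should see no change in the returned arrays.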
12 changes: 6 additions & 6 deletions README.md
@@ -6,15 +6,15 @@
[![Supported Python versions](https://img.shields.io/pypi/pyversions/stempeg.svg)](https://pypi.python.org/pypi/stempeg)

Python package to read and write [STEM](https://www.native-instruments.com/en/specials/stems/) audio files.
Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes it ideal to playback multitrack audio, where users can select the audio sub-stream during playback (e.g. supported by VLC).
Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes it ideal to playback multitrack audio, where users can select the audio sub-stream during playback (e.g. supported by VLC).

Under the hood, _stempeg_ uses [ffmpeg](https://www.ffmpeg.org/) for reading and writing multistream audio, optionally [MP4Box](https://github.com/gpac/gpac) is used to create STEM files that are compatible with Native Instruments hardware and software.

#### Features

- robust and fast interface for ffmpeg to read and write any supported format from/to numpy.
- reading supports seeking and duration.
- control container and codec as well as bitrate when compressed audio is written.
- control container and codec as well as bitrate when compressed audio is written.
- store multi-track audio within audio formats by aggregate streams into channels (concatenation of pairs of
stereo channels).
- support for internal ffmpeg resampling furing read and write.
@@ -70,7 +70,7 @@ conda install -c conda-forge stempeg

Stempeg can read multi-stream and single stream audio files, thus, it can replace your normal audio loaders for 1d or 2d (mono/stereo) arrays.

By default [`read_stems`](https://faroit.com/stempeg/read.html#stempeg.read.read_stems), assumes that multiple substreams can exit (default `reader=stempeg.StreamsReader()`).
By default [`read_stems`](https://faroit.com/stempeg/read.html#stempeg.read.read_stems), assumes that multiple substreams can exit (default `reader=stempeg.StreamsReader()`).
To support multi-stream, even when the audio container doesn't support multiple streams
(e.g. WAV), streams can be mapped to multiple pairs of channels. In that
case, `reader=stempeg.ChannelsReader()`, can be passed. Also see:
@@ -121,7 +121,7 @@ Writing stem files from a numpy tensor can done with.
stempeg.write_stems(path="output.stem.mp4", data=S, sample_rate=44100, writer=stempeg.StreamsWriter())
```

As seen in the flow chart above, stempeg supports multiple ways to write multi-stream audio.
As seen in the flow chart above, stempeg supports multiple ways to write multi-stream audio.
Each of the method has different number of parameters. To select a method one of the following setting and be passed:

* `stempeg.FilesWriter`
@@ -136,8 +136,8 @@ Each of the method has different number of parameters. To select a method one of
Stem will be saved into a single multistream audio.
Additionally Native Instruments Stems compabible
Metadata is added. This requires the installation of
`MP4Box`.
`MP4Box`.

> :warning: __Warning__: Muxing stems using _ffmpeg_ leads to multi-stream files not compatible with Native Instrument Hardware or Software. Please use [MP4Box](https://github.com/gpac/gpac) if you use the `stempeg.NISTemsWriter()`
For more information on writing stems, see [`stempeg.write_stems`](https://faroit.com/stempeg/write.html#stempeg.write.write_stems).
36 changes: 18 additions & 18 deletions docs/read.html
@@ -101,7 +101,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
duration (float): duration in seconds
dtype (numpy.dtype): Type of audio array to be casted into
stem_idx (int): stream id
ffmpeg_format (str): ffmpeg intermediate format encoding.
ffmpeg_format (str): ffmpeg intermediate format encoding.
Choose &#34;f32le&#34; for best compatibility

Returns:
@@ -123,10 +123,10 @@ <h1 class="title">Module <code>stempeg.read</code></h1>

# decode to raw pcm format
if ffmpeg_format == &#34;f64le&#34;:
# PCM 64 bit float
# PCM 64 bit float
numpy_dtype = &#39;&lt;f8&#39;
elif ffmpeg_format == &#34;f32le&#34;:
# PCM 32 bit float
# PCM 32 bit float
numpy_dtype = &#39;&lt;f4&#39;
elif ffmpeg_format == &#34;s16le&#34;:
# PCM 16 bit signed int
@@ -150,7 +150,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
duration=None,
stem_id=None,
always_3d=False,
- dtype=np.float_,
+ dtype=np.float64,
ffmpeg_format=&#34;f32le&#34;,
info=None,
sample_rate=None,
@@ -181,28 +181,28 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
duration (float): Duration to load in seconds.
stem_id (int, optional): substream id,
defauls to `None` (all substreams are loaded).
always_3d (bool, optional): By default, reading a
always_3d (bool, optional): By default, reading a
single-stream audio file will return a
two-dimensional array. With ``always_3d=True``, audio data is
always returned as a three-dimensional array, even if the audio
file has only one stream.
dtype (np.dtype, optional): Numpy data type to use, default to `np.float32`.
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
of os calls on file.
This can be used e.g. the sample rate and length of a track is
already known in advance. Useful for ML training where the
info objects can be pre-processed, thus audio loading can
be speed up.
sample_rate (float, optional): Sample rate of returned audio.
sample_rate (float, optional): Sample rate of returned audio.
Defaults to `None` which results in
the sample rate returned from the mixture.
reader (Reader): Holds parameters for the reading method.
reader (Reader): Holds parameters for the reading method.
One of the following:
`StreamsReader(...)`
Read from a single multistream audio (default).
`ChannelsReader(...)`
Read/demultiplexed from multiple channels.
multiprocess (bool): Applys multi-processing for reading
multiprocess (bool): Applys multi-processing for reading
substreams in parallel to speed up reading. Defaults to `True`

Returns:
@@ -280,7 +280,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
channels = min(_chans)
else:
raise RuntimeError(&#34;Stems do not have the same number of channels per substream&#34;)

# set channels to minimum channel per stream
stems = []

@@ -511,7 +511,7 @@ <h2 id="shape">Shape</h2>
duration=None,
stem_id=None,
always_3d=False,
- dtype=np.float_,
+ dtype=np.float64,
ffmpeg_format=&#34;f32le&#34;,
info=None,
sample_rate=None,
@@ -542,28 +542,28 @@ <h2 id="shape">Shape</h2>
duration (float): Duration to load in seconds.
stem_id (int, optional): substream id,
defauls to `None` (all substreams are loaded).
always_3d (bool, optional): By default, reading a
always_3d (bool, optional): By default, reading a
single-stream audio file will return a
two-dimensional array. With ``always_3d=True``, audio data is
always returned as a three-dimensional array, even if the audio
file has only one stream.
dtype (np.dtype, optional): Numpy data type to use, default to `np.float32`.
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
of os calls on file.
This can be used e.g. the sample rate and length of a track is
already known in advance. Useful for ML training where the
info objects can be pre-processed, thus audio loading can
be speed up.
sample_rate (float, optional): Sample rate of returned audio.
sample_rate (float, optional): Sample rate of returned audio.
Defaults to `None` which results in
the sample rate returned from the mixture.
reader (Reader): Holds parameters for the reading method.
reader (Reader): Holds parameters for the reading method.
One of the following:
`StreamsReader(...)`
Read from a single multistream audio (default).
`ChannelsReader(...)`
Read/demultiplexed from multiple channels.
multiprocess (bool): Applys multi-processing for reading
multiprocess (bool): Applys multi-processing for reading
substreams in parallel to speed up reading. Defaults to `True`

Returns:
@@ -641,7 +641,7 @@ <h2 id="shape">Shape</h2>
channels = min(_chans)
else:
raise RuntimeError(&#34;Stems do not have the same number of channels per substream&#34;)

# set channels to minimum channel per stream
stems = []

@@ -1130,4 +1130,4 @@ <h4><code><a title="stempeg.read.StreamsReader" href="#stempeg.read.StreamsReade
<p>Generated by <a href="https://pdoc3.github.io/pdoc"><cite>pdoc</cite> 0.9.1</a>.</p>
</footer>
</body>
</html>
</html>
20 changes: 10 additions & 10 deletions stempeg/read.py
@@ -71,7 +71,7 @@ def _read_ffmpeg(
duration (float): duration in seconds
dtype (numpy.dtype): Type of audio array to be casted into
stem_idx (int): stream id
ffmpeg_format (str): ffmpeg intermediate format encoding.
ffmpeg_format (str): ffmpeg intermediate format encoding.
Choose "f32le" for best compatibility
Returns:
@@ -93,10 +93,10 @@ def _read_ffmpeg(

# decode to raw pcm format
if ffmpeg_format == "f64le":
# PCM 64 bit float
# PCM 64 bit float
numpy_dtype = '<f8'
elif ffmpeg_format == "f32le":
# PCM 32 bit float
# PCM 32 bit float
numpy_dtype = '<f4'
elif ffmpeg_format == "s16le":
# PCM 16 bit signed int
@@ -120,7 +120,7 @@ def read_stems(
duration=None,
stem_id=None,
always_3d=False,
- dtype=np.float_,
+ dtype=np.float64,
ffmpeg_format="f32le",
info=None,
sample_rate=None,
@@ -151,28 +151,28 @@ def read_stems(
duration (float): Duration to load in seconds.
stem_id (int, optional): substream id,
defauls to `None` (all substreams are loaded).
always_3d (bool, optional): By default, reading a
always_3d (bool, optional): By default, reading a
single-stream audio file will return a
two-dimensional array. With ``always_3d=True``, audio data is
always returned as a three-dimensional array, even if the audio
file has only one stream.
dtype (np.dtype, optional): Numpy data type to use, default to `np.float32`.
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
of os calls on file.
This can be used e.g. the sample rate and length of a track is
already known in advance. Useful for ML training where the
info objects can be pre-processed, thus audio loading can
be speed up.
sample_rate (float, optional): Sample rate of returned audio.
sample_rate (float, optional): Sample rate of returned audio.
Defaults to `None` which results in
the sample rate returned from the mixture.
reader (Reader): Holds parameters for the reading method.
reader (Reader): Holds parameters for the reading method.
One of the following:
`StreamsReader(...)`
Read from a single multistream audio (default).
`ChannelsReader(...)`
Read/demultiplexed from multiple channels.
multiprocess (bool): Applys multi-processing for reading
multiprocess (bool): Applys multi-processing for reading
substreams in parallel to speed up reading. Defaults to `True`
Returns:
@@ -250,7 +250,7 @@ def read_stems(
channels = min(_chans)
else:
raise RuntimeError("Stems do not have the same number of channels per substream")

# set channels to minimum channel per stream
stems = []
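
The `stempeg/read.py` hunks touch two pieces that work together: the ffmpeg intermediate format determines the NumPy dtype used to decode raw PCM, and `dtype` is what the decoded array is then cast to. A rough sketch of that flow, assuming the raw interleaved PCM bytes are already in memory. `PCM_DTYPES` and `decode_pcm` are hypothetical names, not stempeg functions, and the `'<i2'` entry for `s16le` is an assumption because that branch is cut off in the diff. Integer samples are cast but not rescaled here.

```python
import numpy as np

# Hypothetical lookup mirroring the if/elif chain shown in the diff.
PCM_DTYPES = {
    "f64le": "<f8",  # PCM 64 bit float, little-endian
    "f32le": "<f4",  # PCM 32 bit float, little-endian
    "s16le": "<i2",  # PCM 16 bit signed int (assumed; elided in the diff)
}


def decode_pcm(raw, channels, ffmpeg_format="f32le", dtype=np.float64):
    """Decode interleaved raw PCM bytes into a (samples, channels) array."""
    if ffmpeg_format not in PCM_DTYPES:
        raise ValueError(f"unsupported ffmpeg_format: {ffmpeg_format!r}")
    audio = np.frombuffer(raw, dtype=PCM_DTYPES[ffmpeg_format])
    # Interleaved layout: one frame holds one sample per channel.
    audio = audio.reshape(-1, channels)
    # Cast to the requested output dtype (np.float64 by default, as in the diff).
    return audio.astype(dtype)


# Two stereo frames encoded as 32-bit little-endian floats.
raw = np.array([0.0, 0.5, -0.5, 1.0], dtype="<f4").tobytes()
audio = decode_pcm(raw, channels=2)
print(audio.shape, audio.dtype)
```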
