
MusicShield

AI protection for musicians

MusicShield protects music by modifying the machine-perceived acoustic and musical features that generative AI systems rely on, while the audio remains natural-sounding and perceptually unchanged for human listeners.

By leveraging the perception gap between humans and machines, MusicShield makes audio far harder for AI models to interpret and learn from while preserving listener experience.

Music technology background
Core Technology

Listener-Natural, Machine-Disruptive by Design

MusicShield applies feature-aware perturbations to the acoustic and musical cues AI systems rely on for captioning, representation learning, and downstream training. The perturbations are constrained by perceptual guardrails and a controllable protection strength, so listening quality stays natural while model-facing representations shift enough to reduce reliable machine interpretation and reuse (see the conceptual sketch after the list below).

  • Targets machine-relevant features without degrading human listening experience.
  • Produces measurable caption and semantic drift across foundation models.
  • Supports configurable protection levels for different release and workflow needs.
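The exact algorithm behind MusicShield is not described on this page. The sketch below is only a minimal illustration of the general idea under stated assumptions: a toy surrogate encoder stands in for an AI model's audio front end, the perturbation is optimized to push the clip's machine-facing embedding away from its original value, and a simple amplitude budget stands in for the perceptual guardrail and strength control. All names here (ToySurrogateEncoder, protect, the per-strength budget) are illustrative placeholders, not MusicShield components.

```python
import torch
import torch.nn as nn

class ToySurrogateEncoder(nn.Module):
    """Stand-in for an audio-language model's audio encoder (illustrative only)."""
    def __init__(self, dim=64):
        super().__init__()
        self.conv = nn.Conv1d(1, dim, kernel_size=1024, stride=512)

    def forward(self, wav):                       # wav: (batch, samples)
        feats = self.conv(wav.unsqueeze(1))       # (batch, dim, frames)
        return feats.mean(dim=-1)                 # one embedding per clip

def protect(wav, encoder, strength=5, steps=50, lr=1e-3):
    """Push the encoder's embedding away from the original while keeping the
    waveform change inside an amplitude budget tied to `strength` (1-10)."""
    budget = 0.002 * strength                     # placeholder mapping, not the real guardrail
    wav = wav.detach()
    with torch.no_grad():
        target = encoder(wav)                     # original machine-facing embedding
    delta = torch.zeros_like(wav, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        emb = encoder(wav + delta)
        loss = -torch.nn.functional.mse_loss(emb, target)   # maximize embedding drift
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-budget, budget)         # crude stand-in for a perceptual constraint
    return (wav + delta).detach()

# Example: protect one second of 44.1 kHz audio with the toy encoder.
encoder = ToySurrogateEncoder()
clip = torch.randn(1, 44100) * 0.1
protected = protect(clip, encoder, strength=5)
```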

470+

Music professionals in user studies.

Psychoacoustic Masking

Perceptual Guardrail

Uses psychoacoustic masking constraints to preserve listener-facing quality for artists, platforms, and audiences while reducing machine interpretability.
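The page does not specify the masking model itself. As a rough illustration of how a frequency-domain guardrail of this kind can work, the sketch below caps each frequency component of a perturbation a fixed number of decibels below the signal's own level in that bin, so changes ride underneath content that can mask them. Real psychoacoustic models (critical bands, spreading functions, absolute hearing thresholds) are considerably more detailed; cap_perturbation and offset_db are placeholders.

```python
import numpy as np

def cap_perturbation(signal, perturbation, offset_db=20.0):
    """Keep each frequency component of the perturbation at least `offset_db`
    below the signal's own level in that bin (illustrative guardrail only)."""
    sig_spec = np.fft.rfft(signal)
    pert_spec = np.fft.rfft(perturbation)
    allowed = np.abs(sig_spec) * 10 ** (-offset_db / 20.0)     # per-bin ceiling
    mag = np.abs(pert_spec)
    scale = np.minimum(1.0, allowed / np.maximum(mag, 1e-12))  # shrink bins that exceed it
    return np.fft.irfft(pert_spec * scale, n=len(signal))

# Example with a synthetic tone plus a random perturbation.
t = np.linspace(0, 1, 44100, endpoint=False)
signal = 0.5 * np.sin(2 * np.pi * 440 * t)
perturbation = 0.01 * np.random.randn(44100)
safe_perturbation = cap_perturbation(signal, perturbation)
```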

Protection Strength 1-10

Controllable Strength

Adjust protection intensity on a 1-10 scale for demos, releases, and platform integrations based on your workflow and risk tolerance.
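How the 1-10 strength setting maps to internal parameters is not published. The snippet below shows one hypothetical way such a dial could translate into a perturbation budget and an optimization step count; the ProtectionConfig name and the specific numbers are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class ProtectionConfig:
    """Hypothetical mapping from the user-facing 1-10 strength to internal knobs."""
    strength: int                       # 1 (lightest) to 10 (strongest)

    def __post_init__(self):
        if not 1 <= self.strength <= 10:
            raise ValueError("strength must be between 1 and 10")

    @property
    def perturbation_budget(self) -> float:
        return 0.001 * self.strength    # larger allowed change at higher strength

    @property
    def optimization_steps(self) -> int:
        return 20 + 10 * self.strength  # more refinement at higher strength

# Strength 5 is the setting used for the demo clips on this page (see footnote 1).
demo = ProtectionConfig(strength=5)
```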

Formats, Rates, Bitrates

Broad Compatibility

Supports mainstream formats and common sample-rate/bitrate settings, with output configuration aligned to the input audio.
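The I/O layer is not documented here. The helper below sketches one common way to mirror the input configuration in the output, assuming the soundfile library; write_matching_input is a placeholder name, and compressed formats would need an additional encoding step (e.g., via ffmpeg) that is not shown.

```python
import soundfile as sf

def write_matching_input(in_path, protected_audio, out_path):
    """Write protected audio with the same container, sample rate, and sample
    format as the input file, so it drops into existing pipelines unchanged."""
    info = sf.info(in_path)
    sf.write(out_path, protected_audio, info.samplerate,
             subtype=info.subtype, format=info.format)

# Example (paths and the protect() step are placeholders):
# audio, sr = sf.read("track.wav")
# protected = protect(audio)
# write_matching_input("track.wav", protected, "track_protected.wav")
```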

Research Validation

MusicShield is built on peer-reviewed research accepted to the IEEE Symposium on Security and Privacy (S&P 2026).

The core algorithm has been significantly improved and re-engineered for scalability, robustness, and deployment efficiency, enabling consistent protection across large music catalogs and diverse distribution pipelines.

Case Studies

Caption-Level Protection Comparison

Each case compares original and MusicShield-protected tracks across state-of-the-art audio-language models and shows clear shifts in generated captions and semantic descriptions, including genre, mood, instrumentation, harmonic detail, and vocal characterization. Because these outputs are derived from learned audio representations, the differences provide interpretable evidence that MusicShield changes underlying machine-relevant features, not just surface-level signals, while preserving natural listening quality for people. This is especially relevant in modern music training pipelines, where audio-language models are often used to automatically generate music-caption pairs for text-to-music training; by altering machine-perceived semantics, MusicShield can reduce the reliability of such auto-captioned pairs for downstream reuse.
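The captions themselves appear per model in the examples below. As a rough, reproducible proxy for the kind of semantic drift described here, one can embed the original and protected captions with a sentence encoder and measure how far apart they land. This is not how the page's Protection Score is computed (that uses an LLM judge, per footnote 2 below); caption_drift and the model name are illustrative assumptions, and the example captions are paraphrased fragments, not actual model outputs.

```python
from sentence_transformers import SentenceTransformer, util

def caption_drift(original_caption: str, protected_caption: str) -> float:
    """Rough drift proxy: 1 minus the cosine similarity of sentence embeddings
    for the original and protected captions (higher means larger semantic shift)."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode([original_caption, protected_caption], normalize_embeddings=True)
    return 1.0 - float(util.cos_sim(emb[0], emb[1]))

# Example with paraphrased caption fragments (not the actual model outputs):
drift = caption_drift(
    "Heavy, riff-driven hard rock with aggressive vocals",
    "Raw, lo-fi indie pop-rock with spoken-word delivery",
)
```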

Example 1: Funky Firestorm

Across four foundation models, captions for this track consistently drift from heavy, riff-driven hard rock/metal toward lighter punk/indie-pop-rock framings, with repeated changes in vocal presence (including non-melodic spoken delivery), mood interpretation, and production texture (polished metal weight versus raw/lo-fi or pop-forward mixes), indicating substantial re-mapping of machine-perceived features after protection.

Original Track

Protected Track¹
Caption comparison across Music Flamingo, Audio Flamingo 3, Gemini 2.5, and Qwen3-Omni: for each model, the original caption, the protected caption, and a protection analysis are shown.

Example 2: Oh, Marge

Across models, the protected audio is repeatedly reinterpreted away from country-pop toward indie pop/rock and pop-rock ballad framings, with reduced country-signature cues (e.g., twang, pedal steel, harmonica, and character-specific narrative wording) and stronger mainstream electric/synth-pop production descriptions, showing robust and consistent machine-perceived feature drift rather than isolated wording variation.

Original Track

Protected Track¹
Caption comparison across Music Flamingo, Audio Flamingo 3, Gemini 2.5, and Qwen3-Omni: for each model, the original caption, the protected caption, and a protection analysis are shown.

1 Protected clips are currently generated at protection strength 5 (range: 1 to 10). This setting can be adjusted to balance audio quality and protection level based on user needs.
2 Protection Score reflects the degree of machine-perceived feature shift between the original and protected audio. It is computed using caption-based comparisons via an LLM judge. This metric is intended as a reference signal only and may not fully align with every downstream model's behavior.
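The judge model, rubric, and scaling behind the Protection Score are not specified on this page. The sketch below only illustrates the general caption-comparison approach with a hypothetical rubric and an OpenAI-style chat call; the model name, 0-10 scale, and prompt wording are all assumptions, not MusicShield's actual pipeline.

```python
from openai import OpenAI

JUDGE_RUBRIC = """You are comparing two captions describing the same music track.
Rate from 0 (identical meaning) to 10 (completely different musical description),
considering genre, mood, instrumentation, harmony, and vocals.
Reply with a single number."""

def protection_score(original_caption: str, protected_caption: str,
                     model: str = "gpt-4o-mini") -> float:
    """Hypothetical LLM-judge scoring in the spirit of footnote 2."""
    client = OpenAI()  # requires OPENAI_API_KEY in the environment
    reply = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": JUDGE_RUBRIC},
            {"role": "user",
             "content": f"Caption A: {original_caption}\nCaption B: {protected_caption}"},
        ],
    )
    return float(reply.choices[0].message.content.strip())
```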

Downstream Evaluation

Downstream Model Training Behavior

To evaluate downstream protection strength, we trained two text-to-music MusicGen models with the same setup: one on original tracks and one on MusicShield-protected tracks. We assess model capability using CLAPscore and KNNcommon, two commonly used metrics in AI music research, together with prompt-matched generation examples. This controlled comparison isolates how protected training audio affects caption alignment and training-data neighborhood reuse.

Statistical Comparison

Both metrics evaluate generation-model capability. The model trained on MusicShield-protected tracks shows substantially lower caption alignment and much lower nearest-neighbor overlap with the source training music than the model trained on original tracks.

  • CLAPscore: 0.342 (model trained on original tracks) vs. 0.160 (model trained on protected tracks). CLAPscore measures caption-audio semantic alignment in a shared embedding space; a score of 0.160 indicates very weak and unreliable alignment between prompt meaning and generated audio, with genre and mood cues not consistently reflected and no clear semantic match for listeners.
  • KNNcommon: 0.778 (model trained on original tracks) vs. 0.210 (model trained on protected tracks). KNNcommon measures overlap in the top-K nearest training-song chunks between generated audio and the training music associated with the generation captions; a low value (0.210) indicates substantially reduced neighborhood overlap and weaker carry-over of training-audio identity.

KNNcommon can be interpreted as an overlap ratio; for example, a score of 0.84 corresponds to 84% overlap.
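The exact CLAPscore and KNNcommon formulations are not given on this page. The sketch below shows one plausible reading of each, assuming embeddings have already been computed with a CLAP model and stored as NumPy arrays: CLAPscore as mean caption-audio cosine similarity, and KNNcommon as the top-K retrieval overlap described in the note above. Function names and details are illustrative.

```python
import numpy as np

def clap_score(caption_embs: np.ndarray, audio_embs: np.ndarray) -> float:
    """Mean cosine similarity between matched caption and generated-audio
    embeddings in a shared CLAP space (illustrative reading of CLAPscore)."""
    c = caption_embs / np.linalg.norm(caption_embs, axis=1, keepdims=True)
    a = audio_embs / np.linalg.norm(audio_embs, axis=1, keepdims=True)
    return float(np.mean(np.sum(c * a, axis=1)))

def knn_common(gen_emb: np.ndarray, ref_emb: np.ndarray,
               train_chunk_embs: np.ndarray, k: int = 10) -> float:
    """Overlap ratio of top-K nearest training chunks retrieved for the generated
    clip versus the training audio tied to its caption (illustrative KNNcommon)."""
    def topk(query):
        q = query / np.linalg.norm(query)
        t = train_chunk_embs / np.linalg.norm(train_chunk_embs, axis=1, keepdims=True)
        return set(np.argsort(-(t @ q))[:k])
    return len(topk(gen_emb) & topk(ref_emb)) / k
```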

Prompt-Matched Generation Examples

The examples below use identical prompts across both models and compare outputs from the model trained on original tracks versus the model trained on MusicShield-protected tracks. This is a controlled relative-capability test, not a benchmark of production-level audio quality. The training set is intentionally modest compared with large systems trained on thousands of licensed hours, so generations may sound less plausible; even so, the protected-trained model consistently shows lower CLAPscore and KNNcommon, indicating clear protection effectiveness.

  • Classical prompt: "Contemporary classical music with melancholic to uplifting mood." Comparison note: the protected-trained output is less semantically stable.
  • Jazz prompt: "Smooth jazz music with mellow and relaxed groove." Comparison note: prompt-faithful style and mood consistency are lower.

Ready to Protect Your Music Catalog?

Deploy MusicShield through the dashboard or API to keep tracks natural for listeners while reducing machine interpretability in AI training and analysis pipelines, with controllable protection strength matched to your workflow.
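No public API reference is shown on this page, so the snippet below is only a hypothetical request shape: the endpoint URL, field names, and auth header are placeholders, not a documented MusicShield API.

```python
import requests

# Hypothetical request shape only; endpoint, fields, and auth are placeholders.
with open("track.wav", "rb") as f:
    response = requests.post(
        "https://api.example.com/v1/protect",           # placeholder URL
        headers={"Authorization": "Bearer <API_KEY>"},  # placeholder auth
        files={"audio": f},
        data={"strength": 5},                           # 1-10 protection strength
    )
response.raise_for_status()
with open("track_protected.wav", "wb") as out:
    out.write(response.content)
```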

Dashboard + API Access · Controllable Protection Strength · Caption-Level Evidence